Abstract
Differential networks (DN) are important tools for modeling the changes in conditional dependencies between multiple samples. A Bayesian approach for estimating DNs, from the classical viewpoint, is introduced with a computationally efficient threshold selection for graphical model determination. The algorithm separately estimates the precision matrices of the DN using the Bayesian adaptive graphical lasso procedure. Synthetic experiments illustrate that the Bayesian DN performs exceptionally well in numerical accuracy and graphical structure determination in comparison to state-of-the-art methods. The proposed method is applied to South African COVID-19 data to investigate the change in DN structure between various phases of the pandemic.
Citation: Smith J, Arashi M, Bekker A (2022) Empowering differential networks using Bayesian analysis. PLoS ONE 17(1): e0261193. https://doi.org/10.1371/journal.pone.0261193
Editor: Marton Karsai, Central European University, HUNGARY
Received: June 1, 2021; Accepted: November 24, 2021; Published: January 25, 2022
Copyright: © 2022 Smith et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data underlying the results presented in the study are available from https://archive.ics.uci.edu/ml/datasets/spambase for the spambase dataset. The corresponding COVID-19 data are available from https://www.nicd.ac.za/diseases-a-z-index/disease-index-covid-19/surveillance-reports/ and https://ourworldindata.org/coronavirus/country/south-africa.
Funding: This work was based upon research supported in part by the National Research Foundation (NRF) of South Africa, SARChI Research Chair UID: 71199; Ref.: SRUG190308422768 grant No. 120839. The opinions expressed and conclusions arrived at are those of the authors and are not necessarily to be attributed to the NRF. The research of the corresponding author is supported by a grant from Ferdowsi University of Mashhad (N.2/55265).
Competing interests: The authors have declared that no competing interests exist.
Introduction
Probabilistic networks are becoming ever-present in a multitude of scientific disciplines. These networks aim to illustrate the relationships, if any, between the components of complex systems [1]. If the data are assumed to be Gaussian distributed with mean μ and covariance matrix Σ, then the precision matrix Θ ≔ {θij}, defined as the inverse of the covariance matrix, Θ ≡ Σ−1, directly determines the conditional dependence relations and structure of the Gaussian undirected graphical model [2].
Differential network (DN) analysis is a statistical methodology that involves functions of at least two graphical models. Let G = (V, E) define a graphical model with nodes V = {1, …, p} and a set of edges E ⊆ V × V. The graph visually depicts the conditional dependence structure between the nodes of the system. The adjacency matrix associated with a graphical model is the binary encoded p × p precision matrix, where an entry of the matrix is equal to 1 if the corresponding precision matrix entry is nonzero and 0 otherwise. Nonzero adjacency matrix entries indicate an edge between the corresponding nodes of G. For this work, the focus is on the difference of two Gaussian graphical models (GGMs), G1 and G2, that share the same set of nodes V. In particular, the edge sets given here are equivalent to the adjacency matrices obtained from the GGM estimation. More specifically, assume that the observations X1 and X2 are generated from p-variate Gaussian distributions, Np(μ1, Σ1) and Np(μ2, Σ2), respectively, where n1 and n2 indicate the respective sample sizes that need not be equal. The interest here is estimating the DN, Δ ≔ Θ1 − Θ2, that is, the difference between the two precision matrices. Numerous measures exist for comparing and evaluating the differences between graphical structures [1]. DN analysis is becoming increasingly popular and important, for example in biological systems, where protein interaction networks can be utilised as informative biosignatures for prevalent diseases [3, 4]. The fundamental idea here is that, if two molecules interact with one another, then a statistical dependency between them should be observed. Additionally, another application of DNs is multivariate statistical quadratic discriminant analysis [5, 6], under the Gaussian distribution assumption.
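To make the objects above concrete, the following minimal Python sketch (with toy, hypothetical precision values, not taken from the paper) binary-encodes two precision matrices into adjacency matrices and forms the DN as the difference Δ = Θ1 − Θ2; the function name `adjacency` is illustrative only.

```python
import numpy as np

def adjacency(theta, tol=1e-8):
    # Binary encoding: 1 where an off-diagonal precision entry is nonzero.
    A = (np.abs(theta) > tol).astype(int)
    np.fill_diagonal(A, 0)
    return A

# Two toy precision matrices sharing the same p = 3 nodes (illustrative values).
theta1 = np.array([[1.0, 0.5, 0.0],
                   [0.5, 1.0, 0.3],
                   [0.0, 0.3, 1.0]])
theta2 = np.array([[1.0, 0.5, 0.2],
                   [0.5, 1.0, 0.0],
                   [0.2, 0.0, 1.0]])

delta = theta1 - theta2   # the differential network Delta
print(adjacency(delta))   # edges where the conditional dependence structure changes
```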
A key component of DN analysis is the estimation of the covariance and precision matrix components. Numerous statistical matrix estimation, as well as graphical model determination, methods exist in the literature. In particular, from a frequentist viewpoint, [7] introduce a computationally efficient neighborhood selection procedure. The lasso is used for covariance estimation, which enjoys consistency for sparse high-dimensional graphs. The approach is quite effective, in that the sparse precision matrix is estimated by fitting the lasso to each variable using the remaining variables as predictors. Finally, the estimated precision matrix entry θij is nonzero if the estimated coefficient of i on j, or vice versa, is nonzero. Importantly, their algorithm can consistently estimate the set of nonzero entries in Θ [8]. For a penalised likelihood methodology for sparse precision matrix estimation see [9, 10]. Furthermore, [11] estimate the undirected graphical model using both a block coordinate descent algorithm and Nesterov’s first order method [12]. Additionally, [13] propose an ℓ1-constrained estimation technique for both sparse and non-sparse high-dimensional matrices, applicable to a wide range of sparsity patterns and classes of matrices; precision estimation in GGMs, for example. For a joint graphical model estimation approach see [14, 15].
Fully Bayesian treatments of GGM estimation are also well rooted in the literature. In particular, [16] introduces the Bayesian adaptive graphical lasso (BAGLASSO), which utilises a generalised Pareto distribution in the hierarchical formulation of the Bayesian graphical lasso. [17] provide a method for graphical model determination by placing positive prior mass on the event that there are no conditional dependencies between variables. For joint graphical model inference from a Bayesian perspective see [18]. Lastly, [19] propose using Kullback-Leibler divergence and cross-validation for graphical model structure estimation.
Background
Recently, a plethora of statistical techniques have emerged for estimating DNs. These techniques can largely be classified into two main categories. The first estimates the individual precision matrices Θ1 and Θ2 separately; the estimated DN is then the difference between the estimated precision matrices. For example, the methods and references for GGM estimation outlined in the introduction can be used to directly estimate Δ. The second methodology estimates both precision matrices simultaneously. The approach here typically penalises a joint loss function for both precision matrices. [20] provide a methodology for inference and estimation of functions of GGMs. In particular, their Intertwined Graphical Lasso (IGL) approach biases the estimation of the precision matrices towards a common value, while their Graphical Cooperative Lasso (GCL) utilises a group penalty that favours solutions with a common sparsity pattern. [14, 21] estimate separate graphical models using a joint penalised loss function. [22] propose a method for estimating Δ directly, which relaxes the need for the individual precision matrices to be sparse or estimated directly. Similarly, [6, 23] utilise an alternating direction method of multipliers (ADMM) algorithm for estimating Δ from a joint ℓ1 penalised convex loss function. More recently, [24] introduce a computationally efficient iterative shrinkage-thresholding algorithm for minimising the ℓ1 penalised loss function defined in [6], namely

Δ̂ = argminΔ L(Δ) + ρ‖Δ‖1, (1)

where L(Δ) = tr(ΔS1ΔS2)/2 − tr(Δ(S1 − S2)) is convex and S1 and S2 are the sample covariance matrices. The DN estimate is obtained by minimising the penalised loss in Eq (1). An analogous symmetric convex loss function and estimator is proposed by [23].
The shrinkage-thresholding algorithm proposed by [24], based on the fast iterative shrinkage-thresholding algorithm in [25], aims to minimise Eq (1). The objective function is given by f(Δ) = L(Δ) + ρ‖Δ‖1, with L(Δ) the convex loss defined above. The lasso tuning parameter, ρ, controls the strength of the penalty term and, as a result, the amount of shrinkage (precision matrix entries shrunk towards zero). The optimisation converges to the solution sequentially using a quadratic approximation and a gradient descent step. The efficiency of the procedure rests on this approach, resulting in superior computational complexity in contrast to the ADMM approaches of [6, 23]. To conclude this section, it is worth noting that the iterative shrinkage-thresholding method will be used for experimental comparison later; a proximal-gradient sketch of the scheme follows.
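The sketch below implements a plain (non-accelerated) proximal-gradient iteration for the penalised loss in Eq (1), assuming the loss form L(Δ) = tr(ΔS1ΔS2)/2 − tr(Δ(S1 − S2)) stated above; the fast variant of [24, 25] adds a momentum step that is omitted here for brevity, and the function names are illustrative.

```python
import numpy as np

def soft_threshold(x, kappa):
    # Elementwise soft-thresholding: the proximal operator of kappa*||.||_1.
    return np.sign(x) * np.maximum(np.abs(x) - kappa, 0.0)

def ista_dn(S1, S2, rho, n_iter=500):
    # Proximal-gradient sketch for min_Delta L(Delta) + rho*||Delta||_1 with
    # L(Delta) = 0.5*tr(Delta S1 Delta S2) - tr(Delta (S1 - S2)).
    p = S1.shape[0]
    delta = np.zeros((p, p))
    # Step size from a Lipschitz bound on the gradient of the smooth part.
    t = 1.0 / (np.linalg.norm(S1, 2) * np.linalg.norm(S2, 2))
    for _ in range(n_iter):
        grad = 0.5 * (S1 @ delta @ S2 + S2 @ delta @ S1) - (S1 - S2)
        delta = soft_threshold(delta - t * grad, t * rho)
        delta = (delta + delta.T) / 2  # keep the iterate symmetric
    return delta
```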
The main contributions of this study are as follows.
- A framework for Bayesian DN estimation is developed. That is, the DN is estimated by separately estimating each Gaussian graphical model, referred to as the components.
- The graphical lasso is applied in the Bayesian precision matrix estimation in order to efficiently capture sparse patterns in the DN, hence developing the BAGLASSO. A threshold selection strategy, based on a conjugate Wishart prior, that accommodates both dense and sparse graphical structure determination is explored. This strategy, applied to each component of the DN, ensures an accurately sparse DN estimate.
- The proposed Bayesian DN improves on existing classical DN estimation for a number of known network structures.
- An R package for the BAGLASSO block Gibbs sampler has been developed for the interested practitioner and is available on The Comprehensive R Archive Network (CRAN) as abglasso.
The Bayesian DN
A fully Bayesian treatment of DNs remains unexplored, and the novel methodology here aims to develop a simple yet highly accurate Bayesian DN estimation procedure. The novel contribution utilises the BAGLASSO as a launching point to separately estimate the components of the DN. The subsections that follow develop the framework for individual component estimation from a Bayesian viewpoint. Moreover, the framework has been developed for low (p = 10) to moderate (p = 50–100) dimensions where n ≥ p.
The Bayesian graphical lasso prior
Recall that the graphical lasso objective is maximising the penalised log-likelihood

Θ̂ = argmaxΘ∈M+ {log det(Θ) − tr(SΘ/n) − ρ‖Θ‖1}, (2)

where M+ is the space of positive definite matrices, S is the sample covariance matrix and n the sample size. Moreover, ρ ≥ 0 is the shrinkage parameter and Θ = (θij) is the precision matrix. The Bayesian connection to the graphical lasso problem is the maximum a posteriori (MAP) estimate, assuming a random sample from Np(μ, Θ−1), of the following posterior

p(Θ | Y) ∝ det(Θ)n/2 exp(−tr(SΘ)/2) ∏i<j DE(θij | λ) ∏i EXP(θii | λ/2) 1(Θ ∈ M+). (3)

The prior distribution is given by the product of a double exponential (DE) with form p(y) = λ/2 exp(−λ|y|) for the off-diagonal elements and an exponential (EXP) with form p(y) = λ exp(−λy)1(y > 0), otherwise. The value of Θ which maximises the posterior density is the graphical lasso estimate in Eq (2) when ρ = λ/n. Within the Bayesian context λ is treated as the shrinkage parameter. The formulation and interpretations of the graphical lasso prior in Eq (3) have been studied in [26]. The aim therein is the development of varying regularization to infer block structures within the graphical models and efficiently estimating the maximum a posteriori of the corresponding posterior distribution. [16] makes use of this prior formulation for its convenience (the scale mixture of Gaussians formulation of the double exponential) in the development of an efficient block Gibbs sampler, in addition to allowing for the use of a gamma hyperprior on the shrinkage parameter λ for improved precision matrix estimation.
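The ρ = λ/n correspondence can be checked numerically; a minimal sketch using scikit-learn's GraphicalLasso is given below, with the caveat that scikit-learn penalises only the off-diagonal entries, so the match with Eq (2) is approximate and purely illustrative.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
n, p, lam = 200, 10, 4.0   # lam plays the role of the Bayesian shrinkage parameter
X = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)

# The MAP estimate under the graphical lasso prior corresponds (approximately,
# given the off-diagonal-only penalty here) to the graphical lasso with rho = lam/n.
fit = GraphicalLasso(alpha=lam / n).fit(X)
theta_map = fit.precision_
```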
Hierarchical representation
The Gibbs sampler for sampling the precision matrix Θ from the posterior distribution, defined below in Eq (5), associated with the prior in Eq (3), is constructed using a hierarchical representation of Eq (3). This particular hierarchical representation is presented by [16], who follows the same approach as in the development of the Gibbs sampler for the Bayesian lasso in [27]. The Gibbs sampler in [27] utilises the structure of the double exponential distribution as a scale mixture of Gaussians, assuming independence of the conditional double exponential priors [28, 29], in its hierarchical representation to simulate regression parameters from the desired posterior distribution. The positive definite constraint on Θ in Eq (3) implies that the Gaussian components for the θij (DE parameters) in the scale mixture formulation are no longer independent given the scale parameters. To address this issue, the hierarchical representation of the graphical lasso prior in Eq (3) is given by

p(θ | τ, λ) = Cτ−1 ∏i<j N(θij | 0, τij) ∏i EXP(θii | λ/2) 1(Θ ∈ M+), p(τ | λ) ∝ Cτ ∏i<j (λ2/2) exp(−λ2τij/2), (4)

where θ ≔ {θij}i≤j is a vector of the upper triangular matrix entries of Θ and τ = {τij}i<j are the scale parameters. The normalising constant, Cτ, has no closed-form solution. To obtain the marginal distribution in Eq (3), [16] proposes a mixing density proportional to an exponential density with rate parameter λ2/2, and simple substitution circumvents the intractable normalising constant. Finally, the hierarchical representation in Eq (4) is used in the development of the block Gibbs sampler, available in the S1 File, with target posterior distribution

p(θ, τ | Y, λ) ∝ det(Θ)n/2 exp(−tr(SΘ)/2) ∏i<j N(θij | 0, τij) ∏i EXP(θii | λ/2) 1(Θ ∈ M+). (5)
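As an illustration of how the scale mixture is exploited in practice, the sketch below draws the latent scales τij given Θ and λ, assuming the same inverse-Gaussian full conditional as in the Bayesian lasso [27]; the remaining, columnwise updates of Θ are given in the S1 File and the abglasso package, and the function name here is illustrative.

```python
import numpy as np
from scipy import stats

def sample_scales(theta, lam, rng):
    # Data-augmentation step: 1/tau_ij | . ~ Inverse-Gaussian(mean = lam/|theta_ij|,
    # shape = lam**2), as in the Bayesian lasso (an assumption carried over here).
    p = theta.shape[0]
    tau = np.zeros((p, p))
    for i in range(p):
        for j in range(i + 1, p):
            mean = lam / max(abs(theta[i, j]), 1e-12)  # guard against zero entries
            # scipy's invgauss(mu, scale) has mean mu*scale and shape equal to scale.
            u = stats.invgauss.rvs(mu=mean / lam**2, scale=lam**2, random_state=rng)
            tau[i, j] = tau[j, i] = 1.0 / u
    return tau
```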
BAGLASSO
It is well known that the double exponential prior in Eq (3) may over-shrink (under-shrink) large (small) coefficients in Θ. The limitations within a regression context have been studied in [30–32], with alternative proposals. The BAGLASSO, the Bayesian analog of the adaptive graphical lasso [33], exploits the framework and flexibility of the hierarchical representation in Eq (4) to address the aforementioned limitation. This extension serves to improve the accuracy of the precision matrix estimates obtained from the posterior in Eq (5) by allowing for a different shrinkage parameter λij for each corresponding off-diagonal precision matrix entry θij. Recall that the adaptive graphical lasso is given by

Θ̂ = argmaxΘ∈M+ {log det(Θ) − tr(SΘ/n) − ρ ∑i≤j wij|θij|}, (6)

where the wij = 1/|θ̃ij|α for α > 0 are the adaptive weights and the weight matrix Θ̃ = (θ̃ij) is the sample precision matrix.
The form of the Bayesian graphical lasso prior in Eq (3) enables the selection of an appropriate hyperprior on the shrinkage parameter λ; recall that ρ = λ/n in the Bayesian formulation of Eq (2). Adhering to the positive definite constraint on Θ, the prior normalising constant in Eq (3), when a single λ is applied to all elements of Θ, can be handled via a simple change of variables. Thereafter, a gamma prior λ ∼ GA(r, s) and corresponding conditional posterior λ | Θ ∼ GA(r + p(p + 1)/2, s + ‖Θ‖1/2) can be obtained and sampled from. When allowing for individual λij’s for different off-diagonal θij’s, the normalising constant C will inevitably depend on the λij. To address this, a hierarchical formulation can be used to construct a set of prior distributions, serving as the extension of the graphical lasso prior in Eq (3), for the various λij, which mitigates the complications associated with posterior simulation due to the intractable normalising constant. This extension is the BAGLASSO and, assuming a random sample from Np(μ, Θ−1), is given by

p(Θ | {λij}i≤j) = C{λij}−1 ∏i<j DE(θij | λij) ∏i EXP(θii | λii/2) 1(Θ ∈ M+), p({λij}i<j | {λii}) ∝ C{λij} ∏i<j GA(λij | r, s). (7)

The normalising constant C{λij} is intractable, as mentioned above, and the set {λii} are hyperparameters for the diagonal elements of Θ. By construction, the intractable normalising constant cancels in the full conditional of each λij, simplifying computation.
The BAGLASSO selects the amount of shrinkage λij adaptively, according to the current value of θij. To see this, [16] demonstrates that the conditional posterior, λij | Θ ∼ GA(r + 1, |θij| + s), has an expected value that is inversely related to the magnitude of θij. The data-augmented block Gibbs sampler for the hierarchical representation in Eq (7) is the fundamental building block upon which the novel Bayesian DN is devised. A sketch of this adaptive update follows.
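A minimal sketch of the adaptive update, using the conditional posterior quoted above and the hyperparameter values r = 10−2 and s = 10−6 adopted in the synthetic experiments later, might look as follows (function name illustrative).

```python
import numpy as np

def sample_adaptive_lambdas(theta, rng, r=1e-2, s=1e-6):
    # lambda_ij | Theta ~ Gamma(shape = r + 1, rate = s + |theta_ij|) for i < j, so
    # E[lambda_ij | Theta] = (r + 1)/(s + |theta_ij|): large entries are shrunk less.
    p = theta.shape[0]
    lam = np.zeros((p, p))
    iu = np.triu_indices(p, k=1)
    lam[iu] = rng.gamma(shape=r + 1, scale=1.0 / (s + np.abs(theta[iu])))
    return lam + lam.T

# Toy usage: a 5 x 5 precision matrix with constant off-diagonal entries.
lam = sample_adaptive_lambdas(np.eye(5) + 0.4, np.random.default_rng(1))
```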
Technicalities on conditional dependencies
Recall that the precision matrix directly determines the conditional dependence relations and structure of the undirected graphical model. Therefore, correctly estimating precision matrices with sparse structures is essential to adequately gauge the conditional dependency relations between variables. Estimating the precision matrix for both n < p and p ≤ n remains challenging, and regularization is often required [34–36]. A popular choice of prior for Bayesian posterior inference regarding network structure is the conjugate Wishart [37]. An alternative thresholding strategy is presented here, adapted from the recommendation by [32]. In particular, the conjugate Wishart W(3, ϵ Ip) prior is used, where ϵ = 0.001 and Ip is the p-dimensional identity matrix. The corresponding posterior is W(3 + n, (S + ϵ Ip)−1). The posterior samples are used to compute the posterior distribution of the p × p partial correlation matrix P ≔ {ρij}, where ρij = −θij/√(θiiθjj). The recommended strategy here suggests θij ≠ 0 for i ≠ j if

Eh(|ρij| | Y) > η, (8)

where η may vary depending on the underlying graph structure. The Bayesian posterior thresholding recommendation by [16] claims that θij ≠ 0 for i ≠ j if and only if

|ρ̄ij| / Eg(|ρij| | Y) > η, (9)

noting that ρ̄ij is the posterior sample mean estimate of the partial correlation under the graphical lasso prior in Eq (3); g is the standard conjugate Wishart W(3, Ip) and h the conjugate Wishart W(3, ϵ Ip). Moreover, η ∈ [0, 1], with the lower and upper bounds resulting in a completely dense or completely sparse estimate, respectively.
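A sketch of this thresholding strategy, assuming Eq (8) compares the posterior mean absolute partial correlation to η as reconstructed above (function name illustrative):

```python
import numpy as np
from scipy import stats

def wishart_partial_corr_threshold(S, n, eta=0.3, eps=1e-3, n_draws=2000, seed=0):
    # Sample Theta from the conjugate Wishart posterior W(3 + n, (S + eps*I)^{-1}),
    # convert each draw to partial correlations rho_ij = -theta_ij/sqrt(theta_ii*theta_jj),
    # and keep edge (i, j) when the posterior mean of |rho_ij| exceeds eta.
    p = S.shape[0]
    post = stats.wishart(df=3 + n, scale=np.linalg.inv(S + eps * np.eye(p)))
    rng = np.random.default_rng(seed)
    abs_rho = np.zeros((p, p))
    for _ in range(n_draws):
        theta = post.rvs(random_state=rng)
        d = np.sqrt(np.diag(theta))
        abs_rho += np.abs(-theta / np.outer(d, d))
    abs_rho /= n_draws
    A = (abs_rho > eta).astype(int)
    np.fill_diagonal(A, 0)
    return A
```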
The original recommendation for η in Eq (9) is 0.5. The forthcoming synthetic data analysis section describes the simulation procedure, as well as illustrates the performance of the Bayesian DN for different graph structures, namely AR(1), AR(2), sparse random, scale-free, band, cluster, star and circle. The goal here is to suggest a suitable sparsity threshold region under the varying graph structures for the recommended sparsity criterion in Eq (8). The Bayesian DN is applied across all graph structures with thresholds, η, in the range 0.2 to 0.6 in increments of 0.02. The absolute sparsity error is computed for each graph structure under each Bayesian sparsity criterion in Eqs (8) and (9), respectively. The results are based on the median of 40 replications, and the Matthews correlation coefficient (MCC), see [38], is used to determine the best performing threshold. Fig (1a)–(1i) display the optimal threshold, based on the top performing MCC, for each graph structure and Bayesian sparsity criterion for p = 10. The optimal threshold plots for p = 30 and p = 100 are available in the S1 File. The optimal threshold based on Eq (8), η*, for the Bayesian DN is, in most cases, in the neighborhood of the minimum absolute sparsity error and in the region η* ∈ [0.2, 0.4]. Both Bayesian sparsity criterion candidates perform comparably well, noting, however, that Eq (8) requires less computation.
Fig 1. The median of the absolute sparsity error and best performing MCC for various graph structures under varying thresholds for each Bayesian sparsity criterion in Eq (9) (dotted) and Eq (8) (dot-dash) for dimension p = 10. The best performing threshold is indicated by a vertical line with the accompanying MCC value displayed in the legend. (a) Model 1: AR(1). (b) Model 2: AR(2). (c) Model 3: at most 80% sparse. (d) Model 4: at most 40% sparse. (e) Model 5: scale-free. (f) Model 6: band. (g) Model 7: cluster. (h) Model 8: star. (i) Model 9: circle.
Synthetic data analysis
The synthetic experiment is designed to test the parameter estimation and graphical structure determination of the DN estimation for both the novel Bayesian approach (referred to as ‘B-net’) and the iterative shrinkage-thresholding estimator (referred to as ‘D-net’) from [24]. The iterative shrinkage-thresholding estimator uses the lasso penalty and the Bayesian information criterion (BIC) for model estimation and selection, respectively. For all simulations, the assumption is that the observations X1 and X2 are generated from Gaussian distributions Np(0, Σ1) and Np(0, Σ2), respectively. The true DN is Δ = Θ1 − Θ2, where the true precision matrices are Θ1 = Σ1−1 and Θ2 = Σ2−1. The Bayesian DN applies the BAGLASSO in Eq (7) to each sample, i.e. it estimates the precision matrices separately. Furthermore, for excellent performance, set r = 10−2 and s = 10−6 (see the S1 File for more details) for the hyperparameters of the prior distributions of λij for i < j, and λii = 1 for i = 1, …, p. The iterative shrinkage-thresholding approach jointly estimates the precision matrices via Eq (1). The following 9 graphical structure variations are considered in the simulation, where the structure of each is applied to each component in the DN’s composition to achieve the desired structure in the DN itself (a data-generation sketch for structure 1 follows the list):
- structure 1: An AR(1) model.
- Component 1: θij = 0.7|i−j|.
- Component 2: θij = 0.75|i−j|.
- structure 2: An AR(2) model.
- Component 1: θii = 0.1, θi,i−1 = θi−1,i = 0.05 and θi,i−2 = θi−2,i = 0.025.
- Component 2: θii = 1, θi,i−1 = θi−1,i = 0.5 and θi,i−2 = θi−2,i = 0.25.
- structure 3: A sparse random model where both components have approximately up to 80% off-diagonal elements set to zero.
- structure 4: A moderately sparse random model where both components have approximately up to 40% off-diagonal elements set to zero.
- structure 5: A scale-free model where the second component is a scalar multiple of the first.
- structure 6: A band or diagonal model.
- Component 1: θii = 1, θij = 0.2 for 1 ≤ i ≠ j ≤ p/2, θij = 0.5 for p/2 + 1 ≤ i ≠ j ≤ p and θij = 0 otherwise.
- Component 2: θii = 1, θij = 0.7 for 1 ≤ i ≠ j ≤ p/2, θij = 0.9 for p/2 + 1 ≤ i ≠ j ≤ p and θij = 0 otherwise.
- structure 7: A cluster model containing two disjoint groups.
- Component 1: θii = 1, θij = 0.5 for 1 ≤ i ≠ j ≤ p/2, θij = 0.5 for p/2 + 1 ≤ i ≠ j ≤ p and θij = 0 otherwise.
- Component 2: θii = 1, θij = 0.9 for 1 ≤ i ≠ j ≤ p/2, θij = 0.9 for p/2 + 1 ≤ i ≠ j ≤ p and θij = 0 otherwise.
- structure 8: A star model with every node connected to the first node.
- Component 1: θii = 1, θ1,i = θi,1 = 0.1 and θij = 0 otherwise.
- Component 2: θii = 1, θ1,i = θi,1 = 2.1 and θij = 0 otherwise.
- structure 9: A circular model.
- Component 1: θii = 2, θi,i−1 = θi−1,i = 1 and θ1,p = θp,1 = 0.45.
- Component 2: θii = 4, θi,i−1 = θi−1,i = 2 and θ1,p = θp,1 = 0.95.
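As referenced above, here is a minimal data-generation sketch for structure 1; the remaining structures follow the same pattern with their respective component definitions, and the function name is illustrative.

```python
import numpy as np

def ar1_precision(p, phi):
    # Component precision matrix for structure 1: theta_ij = phi**|i - j|.
    idx = np.arange(p)
    return phi ** np.abs(idx[:, None] - idx[None, :])

p, n1, n2 = 10, 100, 100
theta1, theta2 = ar1_precision(p, 0.7), ar1_precision(p, 0.75)
delta_true = theta1 - theta2   # true DN for this structure

rng = np.random.default_rng(42)
X1 = rng.multivariate_normal(np.zeros(p), np.linalg.inv(theta1), size=n1)
X2 = rng.multivariate_normal(np.zeros(p), np.linalg.inv(theta2), size=n2)
```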
The sample sizes and dimensions for each model are n1 = n2 ∈ {50, 100, 200} and p1 = p2 ∈ {10, 30, 100}, respectively. The Bayesian estimates are based on 10000 Monte Carlo iterations after 5000 burn-in iterations. To assess the performance of DN matrix estimation, six loss functions are considered, defined in Table 1, where p denotes the dimension and γi the ith eigenvalue. Notice that some loss functions utilise the true DN matrix and its estimates, while others utilise the eigenvalues and their respective estimates. Table 2 reports the median of L1, L2, EL1, EL2, MAXEL1 and MINEL1 for p = 10, 30, 100 in structures 1−9 based on 40 replications. For each scenario, the best performing measure is boldfaced.
The eigenvalue based loss functions are designed to investigate the extremes of the eigenvalue spectrum. In particular, the MAXEL1 loss function highlights which estimator is favourable in a principal component setting [39]. A few observations are worth noting from Tables 2 and 3. First, the D-net estimator performs better with the AR(1) structure. Second, the B-net estimator performs exceptionally well in the remaining structures. Third, the standard errors for both DN estimation techniques remain relatively consistent across the dimensions considered, with the D-net estimator generally yielding smaller standard errors. This may be due to the fact that the best performing tuning parameter in the solution path leads to highly sparse estimates. The B-net estimation procedure inherits the utilisation of multiple penalty parameters in the precision matrix estimation, leading to robust estimation of the precision matrices.
To assess the performance on graphical structure determination, the specificity, sensitivity, false negative rate, F1 score and MCC are computed, as defined in Table 4. Here, TP, TN, FP and FN denote the number of true positives, true negatives, false positives and false negatives, respectively. Values of specificity, sensitivity, F1 score and MCC closer to one imply better classification performance; the closer the false negative rate is to zero, the better. Further insights on the performance metrics are discussed in [40]. The sparsity of the B-net estimator is determined by the thresholding rule in Eq (8) and the thresholds, η, associated with the MCC values in Fig (1a)–(1i). Similarly, the best performing tuning parameter in the solution path of the D-net algorithm determines the sparsity of that estimator. The median performance scores, based on 40 repetitions, for each graphical structure are presented in Table 5. The main diagonals of the adjacency matrices were not included in the scoring; a sketch of the metric computations follows.
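For reproducibility, the metrics of Table 4 can be computed from the edge-wise confusion counts as in the following sketch (standard definitions assumed; function name illustrative).

```python
import numpy as np

def structure_metrics(tp, tn, fp, fn):
    # Classification metrics from edge-wise confusion counts
    # (main diagonal excluded, as in the scoring above).
    specificity = tn / (tn + fp)
    sensitivity = tp / (tp + fn)
    fnr = fn / (fn + tp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    mcc = (tp * tn - fp * fn) / np.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return dict(specificity=specificity, sensitivity=sensitivity,
                fnr=fnr, f1=f1, mcc=mcc)
```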
The B-net estimator generally outperforms the D-net estimator across all models and all dimensions according to the MCC, f1-score, sensitivity and false negative rate, with the exception of the star case for p = 100. Fig (3a)–(3i) display the true and inferred undirected DN graphs for both the B-net and D-net estimators for p = 10; higher dimensions are available in the S1 File. Lastly, Fig (2a)–(2i) display the true and inferred adjacency matrices for p = 10. Both Figs 2 and 3 visually demonstrate the superiority of the B-net estimator.
Fig 2. Comparison of the true DN, B-net and D-net adjacency matrices for an AR(1), AR(2), sparse random, scale-free, band, cluster, star and circle graphical model and p = 10. (a) Model 1: AR(1). (b) Model 2: AR(2). (c) Model 3: at most 80% sparse. (d) Model 4: at most 40% sparse. (e) Model 5: scale-free. (f) Model 6: band. (g) Model 7: cluster. (h) Model 8: star. (i) Model 9: circle.
Fig 3. Comparison of the true DN, B-net and D-net graphical structure estimates for an AR(1), AR(2), sparse random, scale-free, band, cluster, star and circle graphical model and p = 10. (a) Model 1: AR(1). (b) Model 2: AR(2). (c) Model 3: at most 80% sparse. (d) Model 4: at most 40% sparse. (e) Model 5: scale-free. (f) Model 6: band. (g) Model 7: cluster. (h) Model 8: star. (i) Model 9: circle.
Real data analysis
This section focuses on applying the novel Bayesian DN estimator, B-net, as well as the iterative shrinkage-thresholding estimator, D-net, to the spambase dataset, available at https://archive.ics.uci.edu/ml/datasets/spambase, to investigate changes in DN structure between spam and non-spam data. In addition, the B-net estimator is applied to South African COVID-19 data, obtained from https://www.nicd.ac.za/diseases-a-z-index/disease-index-covid-19/surveillance-reports/, https://ourworldindata.org/coronavirus/country/south-africa and https://mediahack.co.za/datastories/coronavirus, to investigate the change in DN structure between various phases of the pandemic.
Spam data
The objective here is to compare the B-net and D-net graphical model estimates for the spam and non-spam emails. The dataset consists of 1813 spam emails and 2788 non-spam emails. The attributes used in this study include, amongst others, the average length of uninterrupted sequences of capital letters; the total number of capital letters in the e-mail; and an indicator denoting whether the e-mail was considered spam or not.
Following the approach of [24], the data is standardised using a nonparanormal transformation in order to satisfy the Gaussian assumption (a rank-based sketch of such a transformation is given after Fig 4). The B-net estimates are based on 10000 iterations of the Monte Carlo sampler after 5000 burn-in iterations. Fig 4 illustrates the difference between the B-net and D-net estimates. Both estimators indicate the presence of several common hub features, namely ‘edu’, ‘original’, ‘direct’, ‘lab’, ‘telnet’ and ‘addresses’. Both panels suggest that the covariance matrix structures of the spam and non-spam emails may very well differ. Furthermore, given that Hewlett-Packard Labs donated the data, words such as ‘telnet’ and ‘hp’ appear more often in the non-spam emails and can be used to distinguish between spam and non-spam emails.
Fig 4. (a) The Bayesian DN for the spam emails dataset. (b) The iterative-shrinkage DN for the spam emails dataset.
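A rank-based sketch of a nonparanormal-style transform is shown below; note that [24] may use a different variant (e.g. the truncated-ECDF version), so this is illustrative only, and the function name is an assumption.

```python
import numpy as np
from scipy import stats

def npn_transform(X):
    # Map each column to Gaussian scores via its empirical CDF, using the
    # rank/(n + 1) convention so the normal quantiles stay finite.
    n = X.shape[0]
    ranks = np.apply_along_axis(stats.rankdata, 0, X)
    Z = stats.norm.ppf(ranks / (n + 1))
    return Z / Z.std(axis=0)  # rescale columns to unit variance
```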
South African COVID-19 data
The 2019 novel coronavirus (COVID-19) has affected more than 180 countries around the world, including South Africa. The current body of knowledge boasts a wealth of statistical literature that aims at empowering researchers to study and alleviate the impact of the disease, see for example [41]. Understanding the interaction of key metrics and attributes between various phases, cycles or waves of the pandemic may prove invaluable in strategic planning and prevention. The goal here is to use the Bayesian DN, B-net, to illustrate that the interactivity of key daily metrics between suspected homogeneous and heterogeneous phases within the pandemic life cycle is ever changing. In particular, the B-net is used to model the interactivity of daily metrics between the first two peaks or waves; between the first wave and the following plateau; and, finally, between the first and second post-wave plateaus. The data consist of 446 observations from the 7th of February 2020 to the 27th of April 2021. The daily metrics include deaths; performed tests; positive test rate; active cases; tests per active case; recoveries; hospital admissions; hospital discharges; ICU admissions and the number of ventilated patients. It should be noted that no sensitive patient information is used; however, the interested reader is referred to [42] for a detailed treatment and framework for dealing with and sanitizing medical data containing sensitive patient information. Due to the irregularities in data capturing and publishing, a seven-day moving average is applied across all daily metrics. The data is standardised using a nonparanormal transformation in order to satisfy the Gaussian assumption (a preprocessing sketch follows). The B-net is applied to the data using 10000 iterations of the Monte Carlo sampler after 5000 burn-in iterations.
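The preprocessing described above might look as follows, assuming a hypothetical date-indexed data frame with one column per daily metric (the column names and random values below are stand-ins, not the real data) and reusing the npn_transform sketch from the spam example.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the daily metrics (dates x metrics); the real data
# come from the NICD and Our World in Data sources listed above.
dates = pd.date_range("2020-02-07", periods=446, freq="D")
df = pd.DataFrame(np.random.default_rng(7).random((446, 3)),
                  index=dates, columns=["deaths", "tests", "active_cases"])

smoothed = df.rolling(window=7, min_periods=1).mean()   # 7-day moving average
Z = npn_transform(smoothed.to_numpy())                  # transform sketched earlier
```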
Fig 5 highlights the temporal nature of the pandemic between suspected homogeneous and heterogeneous phases. In other words, the cyclical behaviour of individual daily metrics may seem clearly distinctive over time: a peak or wave is always followed by a plateau. Furthermore, extrapolation of the temporal behaviour of individual daily metrics may incorrectly allude to distinct multimodality across multiple daily metrics. Upon observing multiple metrics simultaneously, the crisp group-wise multimodality diminishes rather rapidly. The panels in Fig 6 illustrate the higher proportions of hub features present in the DNs. Interestingly, the Bayesian DN provides insight into the change in interaction between daily metrics across perceived homogeneous pandemic phases, that is, comparisons between the two peaks and the two post-peak plateaus. This change in behaviour could be a result of changes in population adherence to public sanitation awareness; weather conditions; virus mutations or complacency over time.
Fig 5. 7-day moving average filled area line plots with standardised counts for daily new cases; deaths; tests; positive test rate; active cases; tests per active case; recoveries; hospital admissions; hospital discharges; ICU admissions and ventilated patients.
Fig 6. The Bayesian DN and corresponding BAGLASSO graphical models between the first two waves; the first wave and the following plateau; and the difference between the first and second post-wave plateaus. The p-values from Box’s M-test for homogeneity of covariance matrices between the contributing precision matrices were all less than 0.001 [43].
Discussion
The Bayesian differential network estimator is the first of its kind, utilising the excellent graphical structure determination and matrix estimation of the Bayesian graphical lasso [16]. In comparison with the state-of-the-art iterative shrinkage-thresholding approach, the Bayesian differential network offers MCMC output that allows the user to gain deeper insight into, and inference on, the estimation procedure. The numerical accuracy of the Bayesian differential network is, in general, superior to that of the iterative shrinkage-thresholding estimator. Moreover, the Bayesian proposal captures both sparse and dense precision matrix patterns in some well-known graphical structures more accurately. The latter is a result of the Wishart prior’s ability to accommodate the variability in, and adjust to, the data. Furthermore, the thresholding technique for sparse estimation is designed such that it accounts for the effect of prior allocation through the posterior expectation.
Graphical structure learning is a crucial component of the Bayesian differential network estimator. The ad hoc approach provided in Eq (8) suggests a suitable sparsity threshold under varying graph structures. The Bayesian differential network also provides key insights into changes in the interactive behaviour of real data metrics, ranging from filtering spam emails to COVID-19 life cycles. For high-dimensional data, the block Gibbs sampler may be adjusted to incorporate the singular normal distribution presented in [44] in the hierarchical representation in Eq (7). Furthermore, research on simultaneous Bayesian estimation and optimisation of both Θ1 and Θ2 in the construction of the differential network is underway.
Supporting information
S1 File. Supplementary material.
Contains the block Gibbs sampler, as well as additional optimal threshold plots, adjacency heatmaps and graphical network figures for dimensions p = 30 and p = 100.
https://doi.org/10.1371/journal.pone.0261193.s001
(PDF)
Acknowledgments
We would like to sincerely thank both anonymous reviewers for their generous comments on the manuscript. The astute feedback was most welcomed, insightful and above all greatly improved the presentation, scientific justification and readability of the paper.
References
- 1. Shojaie A. Differential network analysis: A statistical perspective. Wiley Interdisciplinary Reviews: Computational Statistics. 2020; p. e1508.
- 2. Koller D, Friedman N. Probabilistic Graphical Models: Principles and Techniques. MIT Press; 2009.
- 3. Chuang H, Lee E, Liu Y, Lee D, Ideker T. Network-based classification of breast cancer metastasis. Molecular Systems Biology. 2007;3(1):140. pmid:17940530
- 4. Taylor I, Linding R, Warder-Farley D, Liu Y, Pesquita C, Faria D, et al. Dynamic modularity in protein interaction networks predicts breast cancer outcome. Nature Biotechnology. 2009;27(2):199–204. pmid:19182785
- 5. Li Q, Shao J. Sparse quadratic discriminant analysis for high dimensional data. Statistica Sinica. 2015;25:457–473.
- 6. Jiang B, Wang X, Leng C. A direct approach for sparse quadratic discriminant analysis. The Journal of Machine Learning Research. 2018;19(1):1098–1134.
- 7. Meinshausen N, Bühlmann P. High-dimensional graphs and variable selection with the lasso. The Annals of Statistics. 2006;34(3):1436–1462.
- 8. Lauritzen S. Graphical Models. Oxford: Clarendon Press; 1996.
- 9. Yuan M, Lin Y. Model selection and estimation in the Gaussian graphical model. Biometrika. 2007;94(1):19–35.
- 10. Friedman J, Hastie T, Tibshirani R. Sparse inverse covariance estimation with the graphical lasso. Biostatistics. 2008;9(3):432–441. pmid:18079126
- 11. Banerjee O, Ghaoui LE, d’Aspremont A. Model Selection Through Sparse Maximum Likelihood Estimation for Multivariate Gaussian or Binary Data. The Journal of Machine Learning Research. 2008;9:485–516.
- 12. Nesterov Y. Smooth minimization of non-smooth functions. Mathematical Programming. 2005;103(1):127–152.
- 13. Cai T, Liu W, Luo X. A constrained ℓ1 minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association. 2011;106(494):594–607.
- 14. Guo J, Levina E, Michailidis G, Zhu J. Joint estimation of multiple graphical models. Biometrika. 2011;98(1):1–15. pmid:23049124
- 15. Danaher P, Wang P, Witten D. The joint graphical lasso for inverse covariance estimation across multiple classes. Journal of the Royal Statistical Society: Series B (Methodological). 2014;76(2):373–397. pmid:24817823
- 16. Wang H. Bayesian graphical lasso models and efficient posterior computation. Bayesian Analysis. 2012;7(4):867–886.
- 17. Banerjee S, Ghosal S. Bayesian structure learning in graphical models. Journal of Multivariate Analysis. 2015;136:147–162.
- 18. Peterson C, Stingo F, Vannucci M. Bayesian inference of multiple Gaussian graphical models. Journal of the American Statistical Association. 2015;110(509):159–174. pmid:26078481
- 19. Williams D, Piironen J, Vehtari A, Rast P. Bayesian estimation of Gaussian graphical models with predictive covariance selection. arXiv preprint arXiv:1801.05725. 2018.
- 20. Chiquet J, Grandvalet Y, Ambroise C. Inferring multiple graphical structures. Statistics and Computing. 2011;21(4):537–553.
- 21. Zhu Y, Li L. Multiple matrix Gaussian graphs estimation. Journal of the Royal Statistical Society: Series B (Methodological). 2018;80(5):927–950. pmid:30505211
- 22. Zhao S, Cai T, Li H. Direct estimation of differential networks. Biometrika. 2014;101(2):253–268. pmid:26023240
- 23. Yuan H, Xi R, Chen C, Deng M. Differential network analysis via lasso penalized D-trace loss. Biometrika. 2017;104(4):755–770.
- 24. Tang Z, Yu Z, Wang C. A fast iterative algorithm for high-dimensional differential network. Computational Statistics. 2020;35(1):95–109.
- 25. Beck A, Teboulle M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences. 2009;2(1):183–202.
- 26. Marlin B, Schmidt M, Murphy K. Group sparse priors for covariance estimation. In: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence; 2009.
- 27. Park T, Casella G. The Bayesian lasso. Journal of the American Statistical Association. 2008;103(482):681–686.
- 28. Andrews DF, Mallows CL. Scale mixtures of normal distributions. Journal of the Royal Statistical Society: Series B (Methodological). 1974;36(1):99–102.
- 29. West M. On scale mixtures of normal distributions. Biometrika. 1987;74(3):646–648.
- 30. Li Q, Lin N. The Bayesian elastic net. Bayesian Analysis. 2010;5(1):151–170.
- 31. Griffin JE, Brown PJ. Inference with normal-gamma prior distributions in regression problems. Bayesian Analysis. 2010;5(1):171–188.
- 32. Carvalho CM, Polson NG, Scott JG. The horseshoe estimator for sparse signals. Biometrika. 2010;97(2):465–480.
- 33. Fan J, Feng Y, Wu Y. Network exploration via the adaptive LASSO and SCAD penalties. The Annals of Applied Statistics. 2009;3(2):521–541. pmid:21643444
- 34. Dempster AP. Covariance selection. Biometrics. 1972;28(1):157–175.
- 35. Mazumder R, Hastie T. The graphical lasso: New insights and alternatives. Electronic Journal of Statistics. 2012;6:2125–2149. pmid:25558297
- 36. Wang J, Jiang B. An efficient ADMM algorithm for high dimensional precision matrix estimation via penalized quadratic loss. Computational Statistics & Data Analysis. 2020;142(2):106812.
- 37. Letac G, Massam H. Wishart distributions for decomposable graphs. The Annals of Statistics. 2007;35(3):1278–1323.
- 38. Matthews BW. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta (BBA)-Protein Structure. 1975;405(2):442–451. pmid:1180967
- 39. Banerjee S, Monni S, Wells M. A regularized profile likelihood approach to covariance matrix estimation. Journal of Statistical Planning and Inference. 2013;179:36–59.
- 40. Iwendi C, Khan S, Anajemba JH, Mittal M, Alenezi M, Alazab M. The use of ensemble models for multiple class and binary class classification for improving intrusion detection systems. Sensors. 2020;20(9):2559. pmid:32365937
- 41. Iwendi C, Bashir AK, Peshkar A, Sujatha R, Chatterjee JM, Pasupuleti S, et al. COVID-19 patient health prediction using boosted random forest algorithm. Frontiers in Public Health. 2020;8(357). pmid:32719767
- 42. Iwendi C, Moqurrab SA, Anjum A, Khan S, Mohan S, Srivastava G. N-sanitization: A semantic privacy-preserving framework for unstructured medical datasets. Computer Communications. 2020;161:160–171.
- 43. Box GE. A general distribution theory for a class of likelihood criteria. Biometrika. 1949;36(3/4):317–346. pmid:15402070
- 44. Bland RP, Owen DB. A note on singular normal distributions. Annals of the Institute of Statistical Mathematics. 1966;18(1):113–116.