A framework for integrating directed and undirected annotations to build explanatory models of cis-eQTL data
Fig 1
Illustration of BAGEA model components.
(A) The core components of the BAGEA model in the summary statistics formulation. Observed variables are shown in squares, estimated variables in circles. The observed inputs are z_j, the eQTL z-scores for gene j, and the LD matrix Σ_j, which defines the correlation between summary statistics. The z-scores are influenced by the true eQTL effects b_j. These effects in turn depend on directed and undirected annotations, V_j and F_j respectively. While undirected annotations can cover regions of any size, directed annotations have the same size as the genomic variants themselves. The impact of annotations on b_j is estimated from the data via ω and ν. (B) An example of how meta-annotations are used to model different priors for the elements of ω via the variable vectors υ. We assume that nine directed annotations are available, derived from three tissues (Liver, Blood and Brain) via three assay types (DHS, H3K27ac and H3K4me3). It is reasonable to assume that for a given eQTL study, particular tissues or cell types are more relevant than others. We model this by introducing a variable υ for each tissue (or cell type) that affects the prior distribution of only those elements of ω that are derived from this tissue; e.g. υ_Liver only affects elements of ω tied to experiments performed in liver. We model different priors for the various assay types analogously. Shown is the resulting network of influences of the variables υ_tissue and υ_assay on ω. (We use the actual group names as indices here, while in the main text, elements of the υ’s and ω are indexed by natural numbers.)
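The grouping in panel B can be written down as two membership matrices, one for tissues and one for assay types. The following is a minimal sketch of that bookkeeping only, not BAGEA's actual implementation or API: all names (`TISSUES`, `ASSAYS`, `membership`, `influenced_elements`) are hypothetical, and we assume one directed annotation per tissue and assay pair, which gives the nine annotations of the example.

```python
from itertools import product

# Hypothetical labels taken from the panel B example.
TISSUES = ["Liver", "Blood", "Brain"]
ASSAYS = ["DHS", "H3K27ac", "H3K4me3"]

# Assumption: one directed annotation per (tissue, assay) pair, nine in total.
# Each annotation corresponds to one element of omega.
ANNOTATIONS = list(product(TISSUES, ASSAYS))


def membership(groups, position):
    """Return a 0/1 matrix: rows are annotations, columns are groups.

    position selects which part of the (tissue, assay) pair is compared:
    0 for the tissue, 1 for the assay type.
    """
    return [[int(ann[position] == g) for g in groups] for ann in ANNOTATIONS]


TISSUE_MEMBERSHIP = membership(TISSUES, 0)
ASSAY_MEMBERSHIP = membership(ASSAYS, 1)


def influenced_elements(group, groups, matrix):
    """List the annotations whose omega element is tied to the given group."""
    col = groups.index(group)
    return [f"{t}_{a}" for (t, a), row in zip(ANNOTATIONS, matrix) if row[col]]


# The upsilon variable for Liver touches only the three liver-derived elements.
print(influenced_elements("Liver", TISSUES, TISSUE_MEMBERSHIP))
# The upsilon variable for DHS touches one element per tissue.
print(influenced_elements("DHS", ASSAYS, ASSAY_MEMBERSHIP))
```

Each row of either matrix contains exactly one 1, so every element of ω receives influence from exactly one tissue variable and one assay variable, which is the network drawn in panel B.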