A framework for integrating directed and undirected annotations to build explanatory models of cis-eQTL data
Fig 4
Directed annotations partially explain gene expression variance in GTEx.
The BAGEA model was fit using various GTEx eQTL data (supplemented with GEUVADIS eQTL data) and ExPecto-derived directed annotations on genes in the training set (chr1, …, chr15) with a top nominal p-value < 10^-7. ExPecto includes 2002 annotations in total, of which either the 1187 histone and DHS annotations from Roadmap (Roadmap) or the 690 non-histone ChIP-Seq annotations from ENCODE (TF) were used. For the Roadmap annotation set, we enforced structure on the priors of ω by using the meta-annotations available for cell type and assay type (group-lasso), while for the TF annotation set, each ω_i parameter was controlled by its individual υ_i parameter (lasso). For each gene j in the test set (chr16, …, chr22 and top nominal p-value < 10^-7), we calculated an approximate version of S_j, the squared magnitude of the directed predictor, where the approximation uses external LD information. Further, we calculated an approximate version of the directed MSE, the mean squared error (MSE) when predicting gene expression y_j from the directed predictor. (A) Displayed is the average (approximated) directed MSE across all genes for each GTEx experiment and annotation subset. 95% confidence intervals were computed by bootstrap sampling. (B) For each GTEx experiment and annotation subset, we sorted results by predictor size S_j and averaged the directed MSE within the top quartile. Displayed is the relationship between the MSE of the predictor and its mean squared magnitude S_j. Averaged S_j, top quartile: the mean value of the directed predictor size S_j in the top quartile, on the horizontal axis; Averaged Directed MSE: the averaged directed MSE of genes falling into the top quartile in terms of S_j, on the vertical axis. The 95% confidence interval for each window was derived by bootstrap sampling. We see that the average squared magnitude S_j is of similar size to the gains in directed MSE, suggesting that BAGEA does not substantially overfit.
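
The two summaries described in this legend (a bootstrap confidence interval for a per-gene average, and averaging within the top quartile of predictor size) can be illustrated with a minimal sketch. This is not the BAGEA implementation: the function names `bootstrap_mean_ci` and `top_quartile_summary`, the use of a percentile bootstrap, the number of resamples, and the input arrays are all assumptions made for illustration, since the legend only states that bootstrap sampling was used.

```python
import numpy as np


def bootstrap_mean_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Return (mean, lower, upper) from a percentile bootstrap over genes.

    Assumption: a percentile bootstrap; the legend does not say which
    bootstrap variant was used.
    """
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample genes with replacement and recompute the mean for each resample.
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    boot_means = values[idx].mean(axis=1)
    lower, upper = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return values.mean(), lower, upper


def top_quartile_summary(s, directed_mse):
    """Average predictor size and directed MSE over the top quartile of S_j.

    `s` and `directed_mse` are hypothetical per-gene arrays of equal length.
    """
    s = np.asarray(s, dtype=float)
    directed_mse = np.asarray(directed_mse, dtype=float)
    # Keep genes whose predictor size is at or above the 75th percentile.
    mask = s >= np.quantile(s, 0.75)
    return s[mask].mean(), directed_mse[mask].mean()


if __name__ == "__main__":
    # Synthetic per-gene values, for demonstration only.
    rng = np.random.default_rng(1)
    s = rng.gamma(shape=2.0, scale=0.01, size=500)
    directed_mse = s + rng.normal(scale=0.01, size=500)
    print(bootstrap_mean_ci(directed_mse))
    print(top_quartile_summary(s, directed_mse))
```

Under these assumptions, panel (A) corresponds to calling `bootstrap_mean_ci` once per GTEx experiment and annotation subset, and panel (B) to calling `top_quartile_summary` on the same per-gene values, with the interval obtained by bootstrapping the genes inside the top quartile.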