pyPAGE: A framework for Addressing biases in gene-set enrichment analysis—A case study on Alzheimer’s disease
Fig 2
pyPAGE is a novel framework for inference of differentially regulated gene-sets.
(A) Schematic of the pipeline we propose for the analysis of bulk RNA-seq data using pyPAGE. The pipeline starts with preprocessing of the RNA-seq data and then diverges into two branches: one for the analysis of transcriptional regulation and the other for the analysis of post-transcriptional regulation. (B) Precision-recall curves comparing the performance of pyPAGE with that of iPAGE and fgsea. The analysis was performed in four simulated scenarios, with or without added biases and with or without dual regulation patterns. As a general performance metric we report the PR-AUC score; cross glyphs mark the performance at a p-value threshold of 0.01. (C) Robustness of pyPAGE to variations in input data quality. The two curves show the effects of (1) subsampling the data from 5% to 100% in increments of 5% and (2) adjusting the parameter that sets the fraction of deregulated genes within each regulon (the default value of this parameter is 0.5, which explains the divergence of the two curves at 1.0).
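The two quantities reported in panel (B), the PR-AUC score and the precision and recall at a fixed p-value threshold, can be illustrated with a minimal sketch. This is not the benchmarking code behind the figure. It assumes that each simulated gene-set has a p-value and a known ground-truth label, and it uses the step-wise (average precision) estimate of the area, which may differ from the estimator used for the figure. The function names and the example data are hypothetical.

```python
def precision_recall_points(p_values, is_true):
    """Return (recall, precision) pairs, one per distinct p-value threshold."""
    total_true = sum(is_true)
    if total_true == 0:
        raise ValueError("need at least one truly deregulated gene-set")
    ranked = sorted(zip(p_values, is_true))  # most significant first
    points = []
    true_positives = 0
    for i, (p, truth) in enumerate(ranked, start=1):
        true_positives += truth
        # Emit a point only after the last gene-set of a group of tied p-values.
        if i == len(ranked) or ranked[i][0] != p:
            points.append((true_positives / total_true, true_positives / i))
    return points


def pr_auc(points):
    """Step-wise area under the precision-recall curve (average precision)."""
    area = 0.0
    previous_recall = 0.0
    for recall, precision in points:
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


def operating_point(p_values, is_true, threshold=0.01):
    """Recall and precision when calling every gene-set with p <= threshold."""
    called = [truth for p, truth in zip(p_values, is_true) if p <= threshold]
    true_positives = sum(called)
    # Assumption: precision is defined as 1.0 when nothing is called.
    precision = true_positives / len(called) if called else 1.0
    return true_positives / sum(is_true), precision


# Hypothetical toy data: six gene-sets, three of them truly deregulated.
example_p_values = [0.001, 0.005, 0.02, 0.2, 0.5, 0.008]
example_is_true = [True, True, True, False, False, False]

example_points = precision_recall_points(example_p_values, example_is_true)
example_auc = pr_auc(example_points)
example_recall, example_precision = operating_point(example_p_values, example_is_true)
```

In this sketch the pair returned by `operating_point` corresponds to what a cross glyph marks on a curve, while `pr_auc` summarizes the whole curve in a single number.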