Enrichment on steps, not genes, improves inference of differentially expressed pathways
Fig 6
We use the multivariate Fisher’s noncentral hypergeometric distribution to obtain the probability of each outcome, and sum these probabilities over all outcomes for which K = k to obtain the probability mass function P(K = k). We then sum P(K = k′) over all k′ ≥ k, where k is the observed value, to obtain a p-value. fisher_nchgd_pmf is calculated using BiasedUrn [31]. Sorting and compressing the pathway vectors can greatly reduce the size of . Across our test datasets, sorting and compressing the pathway vectors reduces the number of calls to BiasedUrn per dataset by a median of 65-fold (the median of
).
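The calculation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not call BiasedUrn: the outcome probabilities are computed by brute-force enumeration, which is feasible only for very small inputs. The statistic `groups_hit`, the function names, and the example numbers are all hypothetical. The caching by sorted (size, odds, count) triples is only one plausible reading of "sorting and compressing"; it relies on the fact that outcomes differing only by a permutation of groups with identical size and odds have the same probability.

```python
from itertools import product
from math import comb


def outcomes(m, n):
    """Yield every count vector x with 0 <= x[i] <= m[i] and sum(x) == n."""
    for x in product(*(range(mi + 1) for mi in m)):
        if sum(x) == n:
            yield x


def unnormalized_weight(x, m, odds):
    """Unnormalized multivariate Fisher NCHG mass: prod C(m_i, x_i) * odds_i ** x_i."""
    w = 1.0
    for xi, mi, oi in zip(x, m, odds):
        w *= comb(mi, xi) * oi ** xi
    return w


def groups_hit(x):
    """Hypothetical statistic K: number of groups with at least one drawn item."""
    return sum(1 for xi in x if xi > 0)


def pmf_of_statistic(m, odds, n, statistic):
    """Return ({k: P(K = k)}, number of distinct weight evaluations).

    Outcome weights are cached under a sorted key (assumed reading of
    "sorting and compressing"), so permutation-equivalent outcomes are
    evaluated only once.
    """
    cache = {}
    masses = {}
    total = 0.0
    for x in outcomes(m, n):
        key = tuple(sorted(zip(m, odds, x)))
        if key not in cache:
            cache[key] = unnormalized_weight(x, m, odds)
        w = cache[key]
        total += w
        k = statistic(x)
        masses[k] = masses.get(k, 0.0) + w
    pmf = {k: w / total for k, w in masses.items()}
    return pmf, len(cache)


def upper_tail_p_value(pmf, k_observed):
    """p-value as the sum of P(K = k') over all k' >= k_observed."""
    return sum(p for k, p in pmf.items() if k >= k_observed)


if __name__ == "__main__":
    # Illustrative numbers only: four groups, three of them identical.
    m = [2, 2, 2, 3]
    odds = [1.0, 1.0, 1.0, 2.0]
    pmf, n_evaluations = pmf_of_statistic(m, odds, n=3, statistic=groups_hit)
    n_outcomes = sum(1 for _ in outcomes(m, 3))
    print("P(K = k):", pmf)
    print("p-value for observed k = 3:", upper_tail_p_value(pmf, 3))
    print("weight evaluations:", n_evaluations, "of", n_outcomes, "outcomes")
```

In practice the per-outcome probability would come from a dedicated routine such as the one BiasedUrn provides, and the cache would then reduce the number of calls to that routine rather than to the toy weight function used here.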