Bayesian Correlation Analysis for Sequence Count Data

doi:10.1371/journal.pone.0163595

Fig 1.

Concepts in the precision of high-throughput sequencing count data.

(A) Deeper sequencing is analogous to measuring objects using a ruler with finer gradations: all objects are measured with greater absolute precision. However, regardless of sequencing depth, lower count entities (e.g. low expression genes or miRNAs) are measured with less relative precision than higher count entities, similar to the red and blue objects respectively. (B) The Beta distribution can represent our belief over the true, unknown expression levels of genes or miRNAs as a fraction of total expression (for example).
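
The two ideas in this caption can be illustrated with a short numerical sketch. It assumes, purely for illustration, that read counts are binomially sampled and that a uniform Beta(1, 1) prior is placed on the expression fraction, so that the posterior is Beta(1 + count, 1 + total - count). The function names and the example counts are hypothetical and are not taken from the paper.

```python
from math import sqrt


def beta_posterior(count, total, prior_a=1.0, prior_b=1.0):
    """Mean and standard deviation of the Beta posterior over an
    expression fraction, given `count` reads out of `total` reads.

    Assumes binomial sampling and a Beta(prior_a, prior_b) prior
    (uniform by default); both are illustrative assumptions.
    """
    a = prior_a + count
    b = prior_b + (total - count)
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, sqrt(var)


def relative_sd(count, total):
    """Posterior standard deviation divided by posterior mean."""
    mean, sd = beta_posterior(count, total)
    return sd / mean


# Hypothetical counts: a low-count and a high-count entity at one depth.
depth = 1_000_000
low_rel = relative_sd(5, depth)
high_rel = relative_sd(500, depth)

# The same low-count entity sequenced ten times deeper.
_, sd_shallow = beta_posterior(5, depth)
_, sd_deep = beta_posterior(50, 10 * depth)

print(f"relative sd, low count:  {low_rel:.3f}")
print(f"relative sd, high count: {high_rel:.3f}")
print(f"absolute sd, shallow vs deep: {sd_shallow:.2e} vs {sd_deep:.2e}")
```

Under these assumptions the low-count entity has the larger relative uncertainty at a fixed depth, while the deeper run shrinks the absolute uncertainty for the same underlying fraction.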

Fig 2.

Analysis of different priors for Bayesian correlation analysis, on the reduced Wang dataset.

(A) Scatter plot comparing traditional Pearson correlations with Bayesian correlations using a uniform prior for expression levels. (B) Average absolute correlation coefficient for different genes, with increasing expression along the x-axis. (C-D) Similar to panels A-B, but with the Dirichlet-inspired prior. (E-F) Similar to A-B, but with the zero count-inspired prior. (G) Histogram of all pairwise correlation coefficients (self-correlations omitted), for Pearson and all three Bayesian correlation analyses.

Fig 3.

Comparison of Bayesian versus Pearson correlation analysis for the first (left), second (middle) and third (right) Bayesian priors in the erythropoiesis and miRNA datasets.

(A-C) Bayesian versus Pearson correlations for erythropoiesis, using first, second and third priors respectively. (D-F) Ratios of mean Bayesian to Pearson correlations for erythropoiesis as a function of increasing gene expression. (G-I) Bayesian versus Pearson correlations for the miRNA dataset. (J-L) Ratios of mean Bayesian to Pearson correlations.

Fig 4.

Clustering using Bayesian correlation as a similarity measure.

(A-B) Heatmaps for Bayesian (prior 3) and Pearson correlations between genes in the Wang dataset, using the same row- and column-orderings for both. (C-D) Hierarchical clustering of genes by one minus the Bayesian or Pearson correlation respectively. (E) ROC curve for genes with tissue-specific GO terms, ordered by decreasing maximum correlation to any other gene.

Fig 5.

Bayesian versus Pearson correlations with increasing numbers of simulated replicates of the erythropoiesis data. Both correlation measures seem largely insensitive to the number of replicates.

Fig 6.

Spearman correlations compared to Pearson and Bayesian (third prior) correlations, on each of the three datasets.
