Fig 1.
Evaluating the visualization properties of linear, log, and MAD plots.
(A) Table summarizing visualization properties for each of the plot types. (B) Visualizing a dataset of positive fold changes with a linear (upper) and log2 (lower) scale to illustrate dynamic range (each datapoint is a staggered line, alternating between black and another color to aid with datapoint identification). (C-E) A dataset of fold changes ranging from 1/6 to 6 (equal to -5 to 5 in fold change units, see legend on right) is visualized with (C) linear, (D) log, and (E) MAD plots to illustrate their characteristics. Visualization properties of plot types are assessed with: readability based on the transform and the units of the axis tick labels, proportionality from linear fits between the point of no change and extreme datapoints (grey lines), symmetry from boxes that match points of same magnitude (grey shaded boxes to right of each plot), and dynamic range based on whether datapoints are mapped to linear space (medium) or log space (high). (D-E) Inner y-axis tick labels are raw transform units; outer y-axis tick labels are back-transformed to original fold change units.
Fig 2.
Illustration of MAD-FC transform and plot.
(A) Table of fold change datapoints that pair negative fold changes (–) with their corresponding positive fold changes (+), along with a fold change of 1 denoting the point of no change (NC). (B) Plot of fold change datapoints on a linear scale, with negative fold changes compressed between [0, 1] (datapoints from FC column). (C) Fold change values with a mirror transform (fM) applied to the negative fold changes to stretch their positions to match the corresponding positive fold changes (grey rectangle denotes undefined region between [–1, 1], datapoints from MFC column). (D) A contraction transform (fC) pulls both positive and negative fold changes 1 unit closer to zero, eliminating the undefined region but leaving fold change labels shifted 1 unit from their original values (datapoints from MAD-FC column). (E) The transforms in (C) and (D) are reversed on the axis tick labels to annotate the datapoints with their actual fold change values. Plot (D) represents a raw MAD plot, while the transform reversal in step (E) represents a labeled MAD plot. A labeled MAD plot can be annotated with axis tick labels formatted as fractions, exponents, or decimals. A labeled MAD plot has datapoints identical in value to the original fold change measurements, but they are spatially distorted to achieve symmetry and proportionality. Column definitions for (A): |FCU|, absolute fold change units; Direction, whether fold change datapoint is negative/decreasing (–), positive/increasing (+), or no change (NC); FC, fold change value; MFC, fold change value after mirror transform; MAD-FC, fold change value after mirror and contraction transforms.
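The two steps described in (C) to (E) can be sketched in Python as follows. This is a minimal illustration based only on the caption: the function names (`mirror`, `contract`, `mad_fc`, `inverse_mad_fc`) are hypothetical and not taken from the authors' code repository, and the mirror step is assumed to be the negative reciprocal of fold changes below 1, which agrees with the stated mapping of 1/6 to 6 onto -5 to 5 fold change units.

```python
import math


def mirror(fc):
    """Mirror transform (fM), assumed form: a decrease 0 < fc < 1 becomes -1/fc.

    Fold changes of 1 or more are left unchanged, so results fall in
    (-inf, -1) or [1, inf) and the interval (-1, 1) stays empty.
    """
    if fc <= 0:
        raise ValueError("fold change must be positive")
    return fc if fc >= 1 else -1.0 / fc


def contract(mfc):
    """Contraction transform (fC): pull a mirrored value 1 unit toward zero."""
    return mfc - 1.0 if mfc >= 1 else mfc + 1.0


def mad_fc(fc):
    """Raw MAD-FC value: mirror, then contract. No change (fc = 1) maps to 0."""
    return contract(mirror(fc))


def inverse_mad_fc(y):
    """Reverse both transforms, e.g. to label axis ticks in original units."""
    return y + 1.0 if y >= 0 else 1.0 / (1.0 - y)


# Paired decreases and increases land at equal distances from zero.
for fc in (1 / 6, 1 / 2, 1, 2, 6):
    y = mad_fc(fc)
    print(f"FC={fc:.4f}  MAD-FC={y:+.2f}  back-transformed={inverse_mad_fc(y):.4f}")
```

In a labeled MAD plot, the datapoints would be positioned with `mad_fc` while the tick labels are generated with `inverse_mad_fc`, so the axis reads in original fold change values.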
Fig 3.
Comparison of log, linear, and MAD fold change plots for RNA-Seq data.
Volcano plots of p-value versus (A) linear, (B) log, and (C) MAD fold change. MA plots using (E) log2, (F) linear, and (G) MAD fold change versus normalized mean count. Datapoints are annotated as significantly upregulated (Up, red), significantly downregulated (Down, blue), or not statistically significant (NS, black) based on a Wald test adjusted p-value < 0.1 and a fold change greater than ± 1.
Fig 4.
Comparison between fold change plots with interval estimates.
Fold change interval estimates with the same interval width across groups visualized with a (A) linear, (B) log2, and (C) MAD plot (from a simulated dataset with identical dispersion in fold change units across all groups, with interval estimates spanning from -2 to +2 fold change units from the point estimate; error bars are confidence intervals; color gradient used to visually separate groups). Comparison of (D) linear, (E) log2, and (F) MAD fold change plots of protein expression and phosphorylation in HCC1954-P cells treated with either 300 nM refametinib (MEKi) or 15 nM copanlisib (PI3Ki) alone or in combination (MEKi– 300 nM: PI3Ki– 15 nM) (error bars are standard deviation).
Fig 5.
Comparison between fold change box plots.
Fold change boxplots of simulated data visualized with a (A) linear, (B) log2, and (C) MAD plot (from a simulated dataset with identical dispersion in fold change units across all groups, 2-fold change unit differences between each quartile boundary, color gradient used to visually separate groups). Comparison of (D) linear, (E) log2, and (F) MAD plots of mRNA expression of various genes measured from patients with breast invasive carcinoma (BRCA), ovarian serous cystadenocarcinoma (OV), and lung squamous cell carcinoma (LUSC) (datasets from The Cancer Genome Atlas).
Fig 6.
Comparison between fold change violin plots.
Fold change violin plots with the same dispersion of measurements across groups visualized with a (A) linear, (B) log2, and (C) MAD plot (from a simulated dataset with identical dispersion in fold change units across all groups, color gradient used to visually separate groups). Comparison of violin plots with (D) log, (E) linear, and (F) MAD fold change of mRNA expression of various genes measured from patients with breast invasive carcinoma (BRCA), ovarian serous cystadenocarcinoma (OV), and lung squamous cell carcinoma (LUSC) (data from The Cancer Genome Atlas).
Fig 7.
Comparison of heatmaps with different encodings between fold change and color.
Comparison of heatmaps with a (A) linear, (B) log2, and (C) MAD-FC transformed color mapping of differential expression of proteins that interact with ubiquitin, a regulatory protein found in all eukaryotic organisms. The rows are proteins identified as ubiquitin-protein interactors, while the columns are experiment groups that represent ubiquitins of specific chain lengths (linear mono- (Ubi1), tetra- (Ubi4), and hexa-ubiquitin (Ubi6)). Experiments were performed on HeLa cells in vitro, and expression for each group is averaged across three replicates. Note: Color scale is mapped to fold changes after each of the transforms.
Fig 8.
Comparison of volcano plots with a dataset of fold change values with high dynamic range.
Comparison of volcano plots with (A) linear, (B) log2, and (C) MAD fold change of differential gene expression in papaya leaf tissue after drought stress compared to control. Datapoints are annotated as significantly upregulated (Up, blue), significantly downregulated (Down, red), or not statistically significant (NS, black) based on an FDR < 0.05 and a fold change greater than ± 1.
Table 1.
Packages used in code repository.