Independent component analysis recovers consistent regulatory signals from disparate datasets
Fig 6
Predicting regulons using big data.
(A) Pie chart displaying the number of Regulatory, Functional and Uncharacterized iModulons extracted from the COLOMBOS E. coli compendium. (B) Venn diagram illustrating the number of iModulons shared between the COLOMBOS compendium and the combined dataset discussed in Fig 5. (C) Histogram of the overlap coefficients for the 131 iModulons shared between COLOMBOS and the combined dataset. (D) Scatter plot of the iModulon gene weights for the putative HprR iModulon. Purple genes are in both the COLOMBOS iModulon and the combined-dataset iModulon. Red genes are only in the COLOMBOS iModulon, and blue genes are only in the combined-dataset iModulon. The dashed lines indicate iModulon thresholds, and the gray diagonal line is the 45-degree line. (E) Schematic representation of the genes near hprR. (F) Bar chart of the putative HprR iModulon activities from GEO dataset GSE35371. (G) Scatter plot of the iModulon gene weights for an uncharacterized iModulon. Colors are identical to panel (D). (H and I) Relative activities of the iModulon from panel (G) in GEO datasets GSE21839 and GSE55365, respectively. Each dataset is centered to its own reference condition, so relative activities cannot be compared across bar charts. (J) Scatter plot of the iModulon gene weights for the antibiotic-responsive uncharacterized iModulon. (K, L, and M) Bar charts of the antibiotic-responsive iModulon activities from GEO datasets GSE31140, GSE37026, and GSE10158, respectively.
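For readers unfamiliar with the overlap coefficient in panel (C), the following is a minimal sketch that assumes the standard set-based definition, |A ∩ B| / min(|A|, |B|); the caption does not state the exact formula the authors used. The function names and gene labels are hypothetical and serve only as illustration. The second helper splits two gene sets into the three groups colored in panels (D) and (G).

```python
def overlap_coefficient(genes_a, genes_b):
    """Overlap coefficient of two gene sets: |A & B| / min(|A|, |B|).

    Assumes the standard (Szymkiewicz-Simpson) definition; returns 0.0
    if either set is empty to avoid division by zero.
    """
    a, b = set(genes_a), set(genes_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def partition_genes(genes_a, genes_b):
    """Split genes into shared, only-in-A, and only-in-B groups."""
    a, b = set(genes_a), set(genes_b)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


# Hypothetical gene labels, not taken from the paper.
imodulon_a = {"gene1", "gene2", "gene3"}
imodulon_b = {"gene2", "gene3", "gene4", "gene5"}

score = overlap_coefficient(imodulon_a, imodulon_b)
groups = partition_genes(imodulon_a, imodulon_b)
print(round(score, 3), sorted(groups["shared"]))
```

Under this definition, a coefficient of 1 means the smaller gene set is fully contained in the larger one, so the two sets can differ in size and still score 1.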