Fig 1.
Partial least squares discriminant analysis (PLS-DA) of RNA-seq data.
(A) PLS-DA plot of RNA-seq data showing clear transcriptome-based discrimination among the three publication datasets. Each point represents the transcriptome signature of one sample, and ellipses represent the 95% confidence level. (B) Genes with high discriminatory ability identified by PLS-DA. (C) Histogram showing the distribution of COG (Clusters of Orthologous Genes) categories associated with the genes with the highest discriminatory power on the two components generated by PLS-DA. See S2 Table for the full list of discriminatory genes.
Table 1.
Description of the dataset and samples used in this study.
Fig 2.
Illustration of transcriptomic data mining.
(A, C) Visualization of RNA-seq results with volcano plots. (B, D, E, F, G, H) Boxplots showing the read counts for genes of interest across different growth conditions.
Fig 3.
Network construction and module detection with WGCNA.
(A) Network topology analysis for various soft-thresholding powers, showing the scale-free index and the mean connectivity as functions of the soft-thresholding power. (B) Dendrogram of all genes divided into 24 modules, with dissimilarity based on topological overlap, presented with assigned module colors. (C) Dendrogram representing the 24 modules identified by WGCNA. The heightcut (red line, heightcut = 0.0) was used to unmerge modules. The grey module represents genes that are not included in any of the other modules. (D) Heatmap depicting the topological overlap matrix (TOM) among all genes in the analysis. The intensity of the red color indicates the strength of the correlation between each pair of genes.
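As a rough illustration of the quantities named in this caption, the sketch below computes a soft-thresholded adjacency and the topological overlap for a toy dataset. It assumes an unsigned network (adjacency = |Pearson correlation| raised to the soft-thresholding power) and the standard topological overlap formula; the function names, the toy expression profiles, and the power of 6 are illustrative assumptions, not values taken from this study, and the R package WGCNA is not used here.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def soft_threshold_adjacency(profiles, beta):
    """Unsigned adjacency: a_ij = |cor(i, j)| ** beta, with a zero diagonal."""
    n = len(profiles)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a = abs(pearson(profiles[i], profiles[j])) ** beta
            adj[i][j] = adj[j][i] = a
    return adj

def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each gene (diagonal is 0)
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            t = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = t
    return tom

# Toy expression profiles (hypothetical): one row per gene, one column per sample.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, -1.0, 1.0, -1.0],
]
adj = soft_threshold_adjacency(profiles, beta=6)  # beta = 6 is an assumed power
tom = topological_overlap(adj)
# Dissimilarity of the kind used for hierarchical clustering of genes.
dissimilarity = [[1.0 - t for t in row] for row in tom]
```

Raising the correlation to a power suppresses weak correlations far more than strong ones, which is why the choice of power trades off scale-free fit against mean connectivity.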
Fig 4.
Identification of the modules associated with conditions in the three original datasets.
(A) Heatmap depicting the correlation between module eigengenes and the original datasets. Pearson correlation coefficients are shown, with p-values in parentheses. (B) Heatmap of gene expression levels in the modules across the samples in the three original datasets.
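The module-to-dataset correlation in panel A can be sketched as follows, assuming each dataset is encoded as a 0/1 membership indicator across samples and that the module eigengene has already been computed. The eigengene values, the indicator, and the function names are hypothetical. Only the Pearson coefficient and its t statistic are computed; the p-value would then come from a Student t distribution with n - 2 degrees of freedom.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_t_statistic(r, n):
    """t statistic for testing r = 0 with n samples (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical module eigengene across six samples.
eigengene = [0.5, 0.4, 0.6, -0.5, -0.4, -0.6]
# Hypothetical indicator: 1 if the sample belongs to the dataset of interest.
in_dataset = [1, 1, 1, 0, 0, 0]

r = pearson(eigengene, in_dataset)
t = correlation_t_statistic(r, len(eigengene))
# A two-sided p-value can be obtained from the t distribution,
# for example with scipy.stats.t.sf(abs(t), df=len(eigengene) - 2) * 2.
```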
Table 2.
Size of gene co-expression modules.
Fig 5.
Illustration of enrichment analysis with STRING (version 9.0).
Results are presented for the modules “Lightgreen” (red color: Cobalamin biosynthetic pathway), “Darkred” (blue color: Bacteriophage functions), “Royalblue” (red color: Fatty acid biosynthesis), “Purple” (green color: Transposon-encoded proteins; red color: Molybdenum cofactor biosynthesis; yellow color: Endonuclease/relaxase).
Table 3.
Functional enrichment in each module, analyzed with STRING 9.0.
Fig 6.
Network plot depicting the top connections in the “Turquoise” module.
Nodes represent genes, and node size is correlated with the degree of connectivity of the gene.
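A minimal sketch of how node degree could be derived from a list of retained connections is given below. It assumes an unweighted edge list obtained after thresholding the network; the gene names and edges are hypothetical, and the weighted connectivity used by WGCNA itself may differ from this simple count.

```python
from collections import Counter

def node_degrees(edges):
    """Count how many retained connections each gene participates in."""
    degree = Counter()
    for gene_a, gene_b in edges:
        degree[gene_a] += 1
        degree[gene_b] += 1
    return degree

# Hypothetical top connections within one module.
edges = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneA", "geneD"),
    ("geneB", "geneC"),
]
degree = node_degrees(edges)
# Genes ranked from most to least connected; the top entries are hub candidates.
ranked = degree.most_common()
```

A plotting layer could then scale each node's size by its entry in `degree`.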
Table 4.
Identification of hub and bottleneck genes from the WGCNA network.