Fig 1.
Integrated three-pipeline analysis.
Flowchart showing, from top to bottom: inputs obtained from PsychENCODE; preprocessing steps (SVA [5], PVCA [6], outlier removal); the three analytic branches (igraph/MST centrality, WGCNA modules, PCA/t-SNE); and the criteria used for multi-axis target ranking. The bottom of the flowchart summarizes the insights of the analysis. Sample sizes: controls n = 261, schizophrenia n = 153.
Fig 2.
Normal probability (Q–Q) plots and histograms for gene expression distributions in control (A) and schizophrenia (B) samples.
Fig 3.
Control network: module detection workflow.
(A) MST derived from Spearman correlations (see Methods for exact edge-pruning thresholds and component filtering). (B) Community detection (edge-betweenness and Louvain results shown). (C) Module color assignment used for downstream enrichment. Node colors indicate module membership. See Methods for layout algorithms and thresholds (Large Graph Layout and Fruchterman–Reingold).
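As a rough illustration of the construction in panel (A), the sketch below builds a minimum spanning tree from Spearman correlations on a toy expression table. The distance 1 - |rho|, the gene names, and the expression values are assumptions made for this example; the edge-pruning thresholds and component filtering described in Methods are not reproduced.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation (inputs must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def minimum_spanning_tree(expr):
    """Kruskal's algorithm on the assumed distance 1 - |Spearman rho|."""
    genes = sorted(expr)
    edges = sorted(
        (1 - abs(spearman(expr[a], expr[b])), a, b)
        for a, b in combinations(genes, 2)
    )
    parent = {g: g for g in genes}

    def find(g):
        # Union-find root lookup with path halving
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    tree = []
    for dist, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # keep the edge only if it joins two components
            parent[root_a] = root_b
            tree.append((a, b, round(dist, 3)))
    return tree


# Hypothetical toy data: four genes measured in five samples
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.1, 2.3, 2.9, 4.2, 5.5],
    "geneC": [5.0, 3.0, 4.0, 1.0, 2.0],
    "geneD": [2.0, 2.5, 1.0, 3.5, 3.0],
}
mst = minimum_spanning_tree(expr)
for a, b, dist in mst:
    print(a, b, dist)
```

A tree on n genes always has n - 1 edges, so strongly correlated (or anti-correlated) gene pairs are linked first and weaker associations are dropped.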
Fig 4.
Control network: module visualization.
Each module is represented by a different color.
Fig 5.
Control network: layout refinement and community numbering.
(A) community numbering annotated on representative nodes; (B) scaled layout for visualization using Fruchterman–Reingold.
Fig 6.
Schizophrenia (pathogenic) network analysis.
(A) input network (undirected, null weights); (B) gene clustering results; (C) community detection via edge-betweenness.
Fig 7.
Network analysis in schizophrenia: Different modules are represented by different colors.
Fig 8.
Network analysis in schizophrenia: (A) a distinct community number is assigned to each module for enrichment analysis; (B) the network is scaled for clearer visualization.
Table 1.
Quick mapping of each analytic branch to the main figures.
Table 2.
Gene Ontology functional annotation of the communities numbered in Fig 8.
Fig 9.
Scaled centrality visualizations for Control network.
(A) top degree; (B) top eigenvector; (C) top closeness; (D) top betweenness. For each panel the node diameter is proportional to the corresponding centrality score.
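For readers unfamiliar with the four scores, the sketch below computes them on a small hypothetical graph using only the Python standard library. It is a minimal illustration under simplifying assumptions (unweighted, undirected, connected graph), not the igraph calls used for the figures; node names are invented.

```python
from collections import deque


def degree(adj):
    """Number of neighbours per node."""
    return {u: len(nbrs) for u, nbrs in adj.items()}


def closeness(adj):
    """(n - 1) / sum of shortest-path lengths; assumes a connected graph."""
    n = len(adj)
    scores = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        scores[s] = (n - 1) / sum(dist.values())
    return scores


def eigenvector(adj, iterations=200):
    """Power iteration on A + I (same leading eigenvector as A, no oscillation)."""
    score = {u: 1.0 for u in adj}
    for _ in range(iterations):
        new = {u: score[u] + sum(score[v] for v in adj[u]) for u in adj}
        top = max(new.values())
        score = {u: x / top for u, x in new.items()}
    return score


def betweenness(adj):
    """Brandes' algorithm, unnormalised, for unweighted undirected graphs."""
    bc = {u: 0.0 for u in adj}
    for s in adj:
        stack, preds = [], {u: [] for u in adj}
        sigma = {u: 0 for u in adj}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            stack.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {u: 0.0 for u in adj}
        while stack:  # accumulate dependencies from the farthest nodes back
            w = stack.pop()
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {u: b / 2 for u, b in bc.items()}  # each pair was counted twice


# Hypothetical graph: hub H with leaves a, b, c, and a tail c-d
adj = {"H": ["a", "b", "c"], "a": ["H"], "b": ["H"], "c": ["H", "d"], "d": ["c"]}
deg, clo, eig, btw = degree(adj), closeness(adj), eigenvector(adj), betweenness(adj)
print(deg["H"], round(clo["H"], 2), btw["H"], btw["c"])
```

In this toy graph the hub ranks first on all four scores, while node c has low degree but non-zero betweenness because it is the only route to d; this is the kind of divergence between panels that the scaled node diameters make visible.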
Fig 10.
Scaled centrality visualizations for Pathogenic network.
Panels (A–D) correspond to top degree, eigenvector, closeness, and betweenness centrality respectively.
Fig 11.
Distribution histograms of centrality metrics (control).
Notably, in panel (B), the degree distribution approximates a power law: most nodes have fewer than 20 connections, while a small number of hubs have more than 150. This pattern is consistent with a scale-free topology, a hallmark of many biological networks.
Fig 12.
Histograms of network centrality measures in schizophrenia.
Fig 13.
PCA plots with variable factor arrows and grouping variable ellipses.
Fig 14.
(A) Top 1% of genes contributing to the first five principal components (PCs); (B) top 10% of contributing genes.
Fig 15.
(A) Genes and samples are projected onto the same two-dimensional space. Gene contributions are drawn as arrows: the direction indicates the gene’s influence on the principal components, and the length reflects the magnitude of that contribution. The angles between arrows illustrate the degree of correlation among genes. (B) The first five principal components collectively explained approximately 14% of the total variance in the dataset.
Table 3.
Genes with the highest positive and negative contributions to each of the top five principal components (as identified from the loading plots in Fig 14) were subjected to Reactome pathway gene set enrichment analysis.
Fig 16.
PC selection and t-SNE workflow.
(A) Scree plot of observed principal component variances. (B) Permuted (null) variance plotted together with the observed variance; the marked intersection, where the observed curve meets the permuted (noise) curve, sets the number of PCs retained for downstream analysis. (C) Random forest validation (ROC/AUC with retained PCs). (D) Final t-SNE plot using the selected PCs.
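The cutoff rule in panel (B) can be sketched as follows. This is a minimal NumPy illustration that assumes the null is generated by shuffling each gene's values independently across samples; the permutation scheme, the number of permutations, and the toy matrix are assumptions, not the settings used for the figure.

```python
import numpy as np


def explained_variance(X):
    """Fraction of total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)
    singular_values = np.linalg.svd(Xc, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


def n_components_by_permutation(X, n_perm=50, seed=0):
    """Count leading PCs whose observed variance exceeds the permuted mean."""
    rng = np.random.default_rng(seed)
    observed = explained_variance(X)
    null = np.zeros_like(observed)
    for _ in range(n_perm):
        # Shuffle every column independently to destroy gene-gene correlation
        Xp = np.column_stack([rng.permutation(col) for col in X.T])
        null += explained_variance(Xp)
    null /= n_perm
    below = np.nonzero(observed <= null)[0]
    # Index of the first PC at or below the null curve = number retained
    return int(below[0]) if below.size else len(observed)


# Hypothetical toy matrix: 60 samples x 20 genes with two planted factors
rng = np.random.default_rng(1)
factors = rng.normal(size=(60, 2))
loadings = np.zeros((2, 20))
loadings[0, :10] = 2.0   # factor 1 drives the first ten genes
loadings[1, 10:] = 2.0   # factor 2 drives the last ten genes
X = factors @ loadings + rng.normal(size=(60, 20))

k = n_components_by_permutation(X)
print("PCs retained:", k)
```

Because the toy matrix contains two planted factors, the observed curve should drop below the permuted curve after the second component; on real expression data the crossing point is usually less sharp.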
Fig 17.
Dendrogram and module colors.
Fig 18.
Bar plot of the mean gene significance across all genes in each module.
Fig 19.
Scatterplot of gene significance for diagnosis (y-axis) versus intramodular connectivity (kME) based on the module eigengene (x-axis).
Each point represents a gene within the corresponding color-coded module.
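The two axes can be illustrated with a short NumPy sketch. It assumes the usual WGCNA-style definitions: the module eigengene is the first principal component of the standardized module expression, kME is each gene's correlation with that eigengene, and gene significance is taken here as the absolute correlation with diagnosis. The data are simulated, and the use of the absolute value is an assumption.

```python
import numpy as np


def module_eigengene(expr):
    """First principal component of a samples x genes module matrix."""
    Z = (expr - expr.mean(axis=0)) / expr.std(axis=0)
    U, _, _ = np.linalg.svd(Z, full_matrices=False)
    eigengene = U[:, 0]
    # Orient the eigengene so it tracks the module's mean expression
    if np.corrcoef(eigengene, Z.mean(axis=1))[0, 1] < 0:
        eigengene = -eigengene
    return eigengene


def kme_and_gs(expr, trait):
    """Return (kME, gene significance) for every gene (column) in expr."""
    eigengene = module_eigengene(expr)
    n_genes = expr.shape[1]
    kme = np.array([np.corrcoef(expr[:, j], eigengene)[0, 1] for j in range(n_genes)])
    gs = np.array([abs(np.corrcoef(expr[:, j], trait)[0, 1]) for j in range(n_genes)])
    return kme, gs


# Hypothetical module: 40 samples (20 controls, 20 cases), 8 co-expressed genes
rng = np.random.default_rng(0)
diagnosis = np.repeat([0.0, 1.0], 20)
driver = diagnosis + rng.normal(size=40)          # shared signal shifted in cases
expr = driver[:, None] + 0.3 * rng.normal(size=(40, 8))

kme, gs = kme_and_gs(expr, diagnosis)
for j in range(expr.shape[1]):
    print(f"gene{j}: kME={kme[j]:.2f}, GS={gs[j]:.2f}")
```

Genes that sit toward the upper right of such a scatterplot are both central to their module and associated with diagnosis, which is why they are commonly treated as hub-gene candidates.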
Fig 20.
Dendrogram of genes from the most relevant modules.
Branches represent clusters of highly interconnected genes, with module colors indicating the corresponding co-expression modules selected for further analysis.
Fig 21.
Module relationships and gene-level projections.
(A) Heatmap of gene–module eigengene correlations. (B) Multidimensional scaling (MDS) on topological overlap matrix (TOM) dissimilarity, with genes color-coded by module; module “fingers” highlight clusters, with hub genes at their tips. (C) t-SNE of genes colored by module assignment (validates the MDS structure).
Fig 22.
Functional enrichment summary across annotation frameworks.
Panels: (A) KEGG pathway barplot for selected modules; (B) GO molecular function; (C) GO biological process; (D) GO cellular component; (E) Reactome pathways. Notably, the RoyalBlue module is enriched for pathways related to cytosol-to-endoplasmic reticulum transport and antigen processing and presentation of exogenous peptide antigens via MHC class I. The OrangeRed3 module is associated with the regulation of histone H3-K27 methylation.
Table 4.
GO enrichment results.
Table 5.
KEGG pathway enrichment results.
Table 6.
Operational decision framework linking multi-dimensional ranking evidence to experimental validation strategy.