Assessment of complementarity of WGCNA and NERI results for identification of modules associated to schizophrenia spectrum disorders

doi:10.1371/journal.pone.0210431

Fig 1.

Flowchart of the pipeline for both WGCNA and NERI.

Using the same datasets (BAHN and KATO), the two methods apply different types of network analysis based on case and control groups, co-expression analysis and Network Medicine concepts. WGCNA uses the expression values to build the case and control networks. NERI integrates additional types of data (such as seed genes and PPI databases), which it needs to build the case and control networks from the analysis of all shortest paths between all pairs of seed genes. Our approach, WGCNA-NERI, combines WGCNA’s least preserved module with NERI’s Δ′ score lists. With these results (for both the BAHN and KATO datasets), we performed an enrichment analysis (using DAVID v6.8 and the KEGG database) and an MSET analysis, which compares the overlap of these genes with schizophrenia-related gene lists (CNV, DEG, DMG, exome and GWAS) from the SZDB database. Finally, we identified the hub genes from both the WGCNA and NERI results and their overlap with the MSET results.
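
The shortest-path step mentioned in this caption can be illustrated with a minimal sketch. It is not the NERI implementation: it only enumerates all shortest paths between every pair of seed genes in an unweighted network and counts how often each gene appears on them. The Δ′ score is not reproduced here. The function names, the toy network and the gene labels are all hypothetical.

```python
from collections import deque
from itertools import combinations


def all_shortest_paths(adj, source, target):
    """Return every shortest path from source to target in an unweighted graph."""
    # Breadth-first search that records, for each node, all predecessors
    # lying on some shortest path from the source.
    dist = {source: 0}
    preds = {source: []}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                preds[nxt] = [node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                preds[nxt].append(node)
    if target not in dist:
        return []
    # Walk the predecessor lists back from the target to the source.
    paths = []
    stack = [[target]]
    while stack:
        partial = stack.pop()
        head = partial[-1]
        if head == source:
            paths.append(partial[::-1])
        else:
            for pred in preds[head]:
                stack.append(partial + [pred])
    return paths


def genes_on_seed_paths(adj, seeds):
    """Count how often each gene lies on a shortest path between two seeds."""
    counts = {}
    for a, b in combinations(sorted(seeds), 2):
        for path in all_shortest_paths(adj, a, b):
            for gene in path:
                counts[gene] = counts.get(gene, 0) + 1
    return counts


# Toy, made-up undirected network (S* are seeds, G* are other genes).
toy_ppi = {
    "S1": {"G1", "G2"},
    "G1": {"S1", "S2"},
    "G2": {"S1", "S2"},
    "S2": {"G1", "G2", "G3"},
    "G3": {"S2", "S3"},
    "S3": {"G3"},
}
counts = genes_on_seed_paths(toy_ppi, {"S1", "S2", "S3"})
```

In this toy network, genes that sit on many seed-to-seed paths (such as the hypothetical `G3`) receive higher counts, which is the intuition behind prioritizing genes near the seeds.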

Table 1.

Clinical and demographic statistics for post-mortem brain tissue from the prefrontal cortex (BA46A).

The hgu133a chip (Affymetrix Human Genome U133 Set) was used in all three databases. All p-values were calculated with the test appropriate to the data distribution: χ² test, Student's t-test or Mann-Whitney test.

Table 2.

PPI databases used in the application of the NERI method.

Fig 2.

Correlation between the kME of the least preserved modules in BAHN and KATO (Royalblue and Greenyellow, respectively) and sample traits, using an LME model.

For both datasets, the panels show Age (a), Gender (b), Disease (c), PMI (d) and brain pH (e). In all panels, the kME M value is plotted on the y-axis and the sample trait on the x-axis.

Fig 3.

Boxplots of the NERI robustness overlap distributions: each boxplot shows the distribution of 50 overlaps (corresponding to 50 random seed sets, one per execution) for the top 50, 100, 150 and 200 genes ranked by Δ′ score.

The x-axis shows the proportion of removed seeds (10%, 20%, 30% and 40%, corresponding to 3, 6, 9 and 12 removed seeds, respectively). The y-axis shows the overlap distributions as percentages.
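
The overlap measure plotted here can be sketched as follows. This is an illustration only, not the authors' code: it assumes a hypothetical set of 30 seeds (which matches the removed-seed counts in the caption) and that the overlap is the share of the reference top-k genes that are still in the top-k after seeds are removed. The rankings are made up, and the re-ranking step that NERI would perform is not shown.

```python
import random


def drop_seeds(seeds, fraction, rng):
    """Return the seeds that remain after removing a random fraction of them."""
    n_remove = round(len(seeds) * fraction)
    removed = set(rng.sample(list(seeds), n_remove))
    return [s for s in seeds if s not in removed]


def top_k_overlap(reference_ranking, perturbed_ranking, k):
    """Percentage of the reference top-k genes also found in the perturbed top-k."""
    ref_top = set(reference_ranking[:k])
    new_top = set(perturbed_ranking[:k])
    return 100.0 * len(ref_top & new_top) / k


rng = random.Random(0)  # fixed seed so the sketch is reproducible
seeds = [f"seed{i}" for i in range(30)]  # hypothetical seed identifiers
kept = {frac: drop_seeds(seeds, frac, rng) for frac in (0.1, 0.2, 0.3, 0.4)}

# Made-up rankings: 'reference' uses all seeds, 'perturbed' a reduced seed set.
reference = ["g1", "g2", "g3", "g4"]
perturbed = ["g2", "g1", "g9", "g8"]
overlap_top2 = top_k_overlap(reference, perturbed, 2)
overlap_top4 = top_k_overlap(reference, perturbed, 4)
```

Repeating the removal 50 times per fraction and collecting the overlaps for each k would give the kind of distribution each boxplot summarizes.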

Fig 4.

Replication analysis for the WGCNA least preserved modules (A) and the NERI Δ′ score (B): intersection of the final results obtained by WGCNA and NERI for the two databases, BAHN and KATO.

(A) WGCNA, least preserved module (total N = 12,719 genes); (B) NERI, Δ′ score (total N = 9,554 genes).

Fig 5.

Final intersection analysis of BAHN and KATO: WGCNA × NERI intersections between the BAHN and KATO databases for each of the WGCNA and NERI results presented in the Venn diagrams in Fig 4.

Results with the W_ prefix represent the WGCNA least preserved module; those with the N_ prefix represent the genes from NERI’s Δ′ ranking.
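
The intersections described here amount to set operations on gene lists. A minimal sketch, assuming each result is available as a set of gene symbols: the labels `W_BAHN`, `W_KATO`, `N_BAHN` and `N_KATO` and the gene names are hypothetical and only follow the W_/N_ convention in the caption.

```python
from itertools import combinations


def pairwise_intersections(gene_sets):
    """Map each pair of labels to the genes shared by the two sets."""
    return {
        (a, b): gene_sets[a] & gene_sets[b]
        for a, b in combinations(sorted(gene_sets), 2)
    }


# Made-up gene sets, one per method (W_ or N_) and dataset.
toy_sets = {
    "W_BAHN": {"GENE_A", "GENE_B", "GENE_C"},
    "W_KATO": {"GENE_B", "GENE_C", "GENE_D"},
    "N_BAHN": {"GENE_C", "GENE_E"},
    "N_KATO": {"GENE_C", "GENE_D"},
}
shared = pairwise_intersections(toy_sets)

# Genes present in every result, across both methods and both datasets.
common_to_all = set.intersection(*toy_sets.values())
```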

Fig 6.

DAVID clusters, grouped by biological category, for the WGCNA, NERI and WGCNA-NERI results from the BAHN database.

The network is based on the DAVID results for WGCNA-NERI from the BAHN database, as seen in Table 3. The small nodes at the center are the identified clusters; the larger nodes on the outside represent each of the 11 biological categories identified in Table 3.

Fig 7.

The top 20 pathways (ranked by FDR, with FDR ≤ 0.05) from the KEGG database for the BAHN dataset.

Both plots show the intersection between the gene lists obtained from the WGCNA least preserved module (right) and NERI (left). The size of each circle represents the count (the number of genes that belong to each pathway). Pathways are listed on the y-axis, and the x-axis shows the Gene Ratio of each result.

Table 3.

Classification of the biological pathways (Gene Ontology) of each cluster obtained from DAVID, according to the WGCNA, NERI and WGCNA-NERI results.

DB = database, CA = cellular activity, CD = all types of cellular death or apoptosis, CG = cellular growth, CR = cellular regulation, CS = cellular signaling, DIF = cellular differentiation, ENZ = protein activity, GEN = genetic related functions, IN = inflammatory pathways, IS = immune system, ME = metabolism, NE = nervous system, EM = embryological activity, CL = total number of clusters formed in each analysis, filtered by Bonferroni ≤ 0.05. The complete list of enriched pathways is available in S5 Table.

Table 4.

MSET analysis of the intersection of the BAHN and KATO results (from the NERI, WGCNA and WGCNA-NERI methods) with the reference sets for Copy Number Variant (CNV), Differentially Expressed Genes (DEG), Differentially Methylated Genes (DMG), Exome (EXOM) and Genome Wide Association Study (GWAS).

All reference sets were provided by SZDB (a database for schizophrenia genetic research).
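
The idea of testing a gene list against a reference set can be sketched with a simple randomization test. This is written in the spirit of an overlap-enrichment test and is not the MSET software itself: the function name, the background definition, the number of iterations and the +1 correction on the p-value are all assumptions made for illustration, and the gene identifiers are made up.

```python
import random


def overlap_permutation_test(candidates, reference, background, n_iter=1000, seed=0):
    """Estimate how unusual the candidate/reference overlap is under random sampling.

    Returns the observed overlap and an empirical p-value: the share of random
    gene sets (same size as the candidates, drawn from the background) whose
    overlap with the reference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    candidates = set(candidates)
    reference = set(reference)
    pool = sorted(background)  # sorted so sampling is reproducible
    observed = len(candidates & reference)
    hits = 0
    for _ in range(n_iter):
        sample = rng.sample(pool, len(candidates))
        if len(reference.intersection(sample)) >= observed:
            hits += 1
    # The +1 terms keep the empirical p-value above zero (an assumption here).
    return observed, (hits + 1) / (n_iter + 1)


# Made-up example: 100 background genes, 10 of them in the reference set.
background = [f"g{i}" for i in range(100)]
reference = background[:10]
enriched_list = background[:5]      # fully contained in the reference
unrelated_list = background[50:55]  # no overlap with the reference

obs_enriched, p_enriched = overlap_permutation_test(enriched_list, reference, background)
obs_unrelated, p_unrelated = overlap_permutation_test(unrelated_list, reference, background)
```

A list drawn entirely from the reference gives a small empirical p-value, while a list with no overlap gives a p-value of 1, since every random sample overlaps at least as much.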
