Table 1.

Pathway module function and cluster enrichment pattern.


Fig 1.

Flow diagram of studies collection and screening process.

An initial selection of 26 SSc studies from the GEO and ArrayExpress data repositories was screened for sample redundancy (to exclude studies sharing the same set of samples), biopsy site (studies of non-skin samples were excluded) and commonly shared genes (2 data sets removed), yielding the 9 SSc studies used in the final meta-analysis.


Fig 2.

Schematic view of data processing and analysis.

(A) Nine SSc studies were used in the current meta-analysis. (B) The preprocessed gene expression matrix of each study was projected into pathway space using the GSVA algorithm. (C) The 9 SSc studies were projected into a single pathway enrichment table. (D) Multiple filters were applied to remove pathways with constant enrichment scores. (E) A consensus clustering procedure was applied and (F) the optimal number of subsets was determined. (G) A machine learning procedure was used to determine the best pathway modules at maximum accuracy. (H) Genes were extracted from the selected top pathway modules that differentiate SSc subtypes and (I) subsequent network analysis was applied to select important gene regulators behind each subset.


Fig 3.

Determination of the optimal number (K) of detailed SSc subsets.

(A) Heatmaps of consensus matrices for K (number of clusters) ranging from 2 to 9, showing a clear-cut consensus clustering pattern at K = 8. (B) Silhouette plot showing the average silhouette score peaking at K = 8 when considering more than 3 clusters. (C) Delta area plot showing the relative change in the area under the cumulative distribution function (CDF) curve, with the dotted line as reference. From K = 8 onward, the change in the area under the CDF curve becomes minimal. (D) Consensus CDF plot showing that at K = 8 the area under the CDF curve is almost maximized, with little improvement when K is increased to 9 or 10.
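The delta area criterion in panel (C) can be illustrated with a minimal sketch. It assumes the common consensus clustering convention in which the value plotted for the smallest K is the area itself and every later value is the relative change from the previous K; the exact definition used in the study is not stated in the caption. The function names and all numbers below are hypothetical.

```python
def cdf_area(consensus_values):
    """Area under the empirical CDF of consensus values on [0, 1].

    For values confined to [0, 1], this area equals 1 - mean(values).
    """
    if not consensus_values:
        raise ValueError("need at least one consensus value")
    if any(v < 0.0 or v > 1.0 for v in consensus_values):
        raise ValueError("consensus values must lie in [0, 1]")
    return 1.0 - sum(consensus_values) / len(consensus_values)


def delta_area(areas_by_k):
    """Relative change in CDF area between consecutive values of K.

    Assumed convention: the smallest K reports its own area, later
    values report (A(K) - A(K_prev)) / A(K_prev).
    """
    ks = sorted(areas_by_k)
    deltas = {}
    for i, k in enumerate(ks):
        if i == 0:
            deltas[k] = areas_by_k[k]
        else:
            prev = areas_by_k[ks[i - 1]]
            deltas[k] = (areas_by_k[k] - prev) / prev
    return deltas


# Hypothetical areas, not taken from the study.
areas = {2: 0.40, 3: 0.50, 4: 0.55, 5: 0.56}
deltas = delta_area(areas)
for k in sorted(deltas):
    print(k, round(deltas[k], 3))
```

A K after which the delta stays close to zero is the kind of plateau the caption describes.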


Fig 4.

The choice of the best supervised machine learning algorithm to classify the SSc pathway set into 8 clusters / subsets.

(A) Different supervised classification methods were evaluated using all features, the top 50 features and the top 100 features selected by recursive feature elimination. (B) The heatmap of the 8 SSc subsets was generated from the top 80 pathways selected by the random forest method according to the Gini index. The 8 subsets are later defined by the pathways in which they are enriched. Subsequently, 5 pathway modules were defined by tree-cut with visual assistance so that their average levels of expression can determine the distinct features of the 8 clusters.
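The module-level summary described in panel (B) amounts to averaging pathway scores within each module for each cluster. The sketch below is an illustration under assumed data structures, not the authors' implementation; the function name, sample labels, pathway labels, module labels and scores are all hypothetical.

```python
from collections import defaultdict


def module_means_by_cluster(scores, sample_cluster, pathway_module):
    """Average pathway score per (cluster, module) pair.

    scores:         {sample: {pathway: enrichment score}}
    sample_cluster: {sample: cluster label}
    pathway_module: {pathway: module label}
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for sample, pathway_scores in scores.items():
        cluster = sample_cluster[sample]
        for pathway, value in pathway_scores.items():
            module = pathway_module.get(pathway)
            if module is None:
                continue  # pathway not assigned to any module
            sums[(cluster, module)] += value
            counts[(cluster, module)] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical toy data: 3 samples, 3 pathways, 2 modules, 2 clusters.
scores = {
    "s1": {"p1": 0.2, "p2": 0.4, "p3": -0.1},
    "s2": {"p1": 0.4, "p2": 0.6, "p3": -0.3},
    "s3": {"p1": -0.5, "p2": -0.3, "p3": 0.6},
}
sample_cluster = {"s1": 1, "s2": 1, "s3": 2}
pathway_module = {"p1": "M1", "p2": "M1", "p3": "M2"}

means = module_means_by_cluster(scores, sample_cluster, pathway_module)
for key in sorted(means):
    print(key, round(means[key], 3))
```

In this toy example cluster 1 is high in module M1 and cluster 2 is high in module M2, which is the kind of contrast that lets module averages characterize clusters.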


Fig 5.

Pearson correlation scatter plots of selected gene regulators showing a significant positive association between normalized gene expression and MRSS.

The 9 genes (CCL2, COL1A1, FGFR1, FN1, IGF1, IL6, IRAK2, MMP1, TLR7) shown in the figure were selected by the random walk or neighborhood scoring algorithm and are positively correlated with MRSS in study GSE58095. These 9 genes are mostly expressed in myeloid / stromal cells, and many of them are involved in cytoskeleton remodeling, innate immune response and cell-cell adhesion processes.
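For reference, the Pearson correlation between a gene's normalized expression and MRSS can be computed as in the following minimal sketch. The function name and the paired values are hypothetical and are not data from GSE58095.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired observations: one gene's normalized expression
# and the corresponding skin scores for five samples.
expression = [0.1, 0.4, 0.35, 0.8, 0.9]
mrss = [5, 12, 10, 25, 30]

r = pearson_r(expression, mrss)
print(round(r, 3))
```

A coefficient near +1 indicates that expression rises with MRSS. Assessing significance additionally requires a test on r, which this sketch does not perform.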


Fig 6.

Hub-gene-centered subnetworks based on random walk and neighborhood scoring algorithms.

(A) TLR7-centered subnetwork identified by the random walk algorithm and MRSS correlations. (B) IRAK2-centered subnetwork identified by the neighborhood scoring algorithm and MRSS correlations. (C) FGFR1-centered subnetwork identified by the random walk algorithm and MRSS correlations. In all 3 subnetworks, hub gene regulators are highlighted in blue, and edges representing inhibitory relationships between two genes are colored light blue.
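The caption does not specify which random walk variant was used. As one illustration, the sketch below implements a random walk with restart on a small undirected graph, a variant often used to rank network nodes by proximity to seed nodes. The function name, the restart probability and the toy graph with nodes A to D are assumptions for illustration only and do not represent gene interactions from the study.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3,
                             tol=1e-12, max_iter=10000):
    """Steady-state visiting probabilities of a random walk with restart.

    adjacency: {node: list of neighbor nodes}; every node needs >= 1 neighbor.
    seeds:     nodes the walker jumps back to with probability `restart`.
    """
    nodes = sorted(adjacency)
    if any(not adjacency[n] for n in nodes):
        raise ValueError("every node must have at least one neighbor")
    seed_set = set(seeds)
    if not seed_set or not seed_set.issubset(nodes):
        raise ValueError("seeds must be a non-empty subset of the nodes")

    start = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    prob = dict(start)
    for _ in range(max_iter):
        # Restart component: a fraction of the mass returns to the seeds.
        nxt = {n: restart * start[n] for n in nodes}
        # Walk component: remaining mass spreads evenly over neighbors.
        for u in nodes:
            share = (1.0 - restart) * prob[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - prob[n]) for n in nodes)
        prob = nxt
        if change < tol:
            break
    return prob


# Hypothetical toy graph: edges A-B, A-C, B-C, C-D.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}
rwr_scores = random_walk_with_restart(graph, seeds=["A"])
for node in sorted(rwr_scores, key=rwr_scores.get, reverse=True):
    print(node, round(rwr_scores[node], 3))
```

Nodes with higher steady-state probability are closer to the seed in the network sense. Such scores could then be combined with other evidence, such as MRSS correlation, when choosing hub genes.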


Fig 7.

Cell type enrichment of 8 SSc clusters and comparison to control.

(A) Violin plots showing enrichment of 4 cell types (endothelial cells, fibroblasts, macrophages and general myeloid cells) among the 8 SSc clusters and the control cohort. Clusters significantly different (p value ≤ 0.05) from the control group are marked with an asterisk. (B) Heatmap showing relative enrichment scores as the -log p value of a t-test comparing each of the 4 cell types against control.
