Sex-specific and opposite modulatory aspects revealed by PPI network and pathway analysis of ischemic stroke in humans

Background Ischemic stroke (IS) is a major disease that greatly threatens human health. Recent studies have shown sex-specific outcomes and mechanisms in cerebral ischemic stroke. This study aimed to identify the key differences in gene expression between male and female IS patients. Methods The gene expression dataset GSE22255, comprising peripheral blood samples, was downloaded from the Gene Expression Omnibus (GEO) database. Differentially Expressed Genes (DEGs) with LogFC>1 and a P-value <0.05 were screened using BioConductor R packages and grouped into female, male and overlapping DEGs for further bioinformatic analysis. Gene Ontology (GO) functional annotation, Protein-Protein Interaction (PPI) network analysis, “Molecular Complex Detection” (MCODE) module screening, CytoNCA (Cytoscape network centrality analysis) essential gene screening and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway interrelation analysis were performed. Results Of a total of 54,665 genes, 185 DEGs (73 upregulated and 112 downregulated) were obtained in the female dataset and 461 DEGs (297 upregulated and 164 downregulated) in the male dataset, of which 118 DEGs overlapped (7 with similar changes and 111 with opposite changes in females and males). Female, male and overlapping DEGs were enriched for similar cellular components and molecular functions. Male DEGs were enriched for biological processes that diverged from those of female and overlapping DEGs. Sex-specific and overlapping DEGs were put into PPI networks. Overlapping genes such as IL6 presented opposite changes and were mainly involved in cytokine-cytokine receptor interactions, the TNF-signalling pathway, etc. Conclusion The analysis of sex-specific DEGs from GEO human blood samples showed not only sex-specific but also opposite DEG alterations in the female and male stroke genome-wide dataset. The results provided an overview of sex-specific mechanisms, which might provide insight into stroke and its biomarkers and lead to sex-specific prognosis and treatment strategies in future clinical practice.



1. Introduction
Stroke is the leading cause of disability and the second leading cause of death worldwide, with an annual incidence of approximately 17 million. It is estimated that a stroke occurs every 40 seconds in the United States, and 87% of these strokes are ischemic [1]. Currently, prompt treatment of the thrombosis is an effective therapeutic strategy, and long-term antiplatelet or anticoagulant medication is required to prevent IS recurrence [2].
Sex differences have been observed in IS: females have a higher incidence and longer life expectancy but worse functional outcomes [3,4]. These differences might be due to different risk factors [5,6,7], anatomical differences such as an incomplete circle of Willis or white matter integrity [8,9], the biologically inherent sex chromosome complement together with gonadal hormones [10,11,12], social factors in in-hospital care [6,13], and, consequently, pathology and treatment [14,15]. However, animal-based research has observed decreased infarct size and improved outcomes in females compared with males [16]. In addition, a parallel characterization of the cytokine and chemokine response to stroke in the human and mouse brain at different stages of infarct resolution has been reported [17]. Many studies have observed sex differences in IS; however, the mechanism has yet to be elucidated.
High-throughput platforms for gene expression analysis, such as microarrays, are promising tools for inferring biological relevance, especially for the complex networks involved in IS. However, there has been no genome-wide exploration of sex-specific mechanisms in IS. In this study, we investigated well-documented original data from the Gene Expression Omnibus (GEO) [18]. The human stroke blood sample dataset, GSE22255, was selected for further exploration [19]. Gene ontology and biological function annotation were performed, followed by PPI network and related analyses. Using bioinformatic methods, new insight into the mechanisms of IS can be obtained, which may reveal potential biomarker candidates for clinical use and drug target discovery.

Microarray data
The authors declare that all supporting data are available within the article and its online supplementary files. GEO dataset mining showed that few homogeneous studies were available (S1 Fig). Therefore, with the aim of translation to clinical practice, we chose human whole blood cell datasets for this comparative bioinformatic analysis. The gene expression profiles of GSE22255, a whole blood cell dataset for human cerebral ischemia, were downloaded from GEO [20]. GSE22255 was generated on the GPL570 platform, [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array. The GSE22255 data set contained 40 samples, including 10 female IS patients, 10 male IS patients, 10 female controls, and 10 male controls [19]. The imported CEL files were subjected to background correction, normalization, and summarization using the robust multichip average (RMA) algorithm [19].
The GSE22255 study was approved by the ethics committees of the participating institutions. All participants were informed of the study and provided informed consent [19]. For the analysis in this study, neither ethics approval nor patients' informed consent was needed.

Identification of DEGs
The analysis was carried out with open-source software, namely the R environment (https://www.r-project.org/) and Bioconductor packages, which significantly simplify the development and distribution of preprocessing methods for gene expression microarrays [21,22]. The full SOFT file of the GSE22255 DataSet was downloaded (https://www.ncbi.nlm.nih.gov/sites/GDSbrowser?acc=GDS4521). Female and male IS patients and controls were assigned according to the GSE22255 annotation. DEGs were identified after the probe signals were averaged and normalized, and log2-transformed values were calculated for a total of 54,665 genes. Genes with LogFC>1 and a P-value <0.05 were considered differentially expressed. Sex-specific DEGs were obtained by comparing female IS patients with female controls, and male IS patients with male controls, respectively.
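The thresholding step described above can be illustrated with a minimal Python sketch. It is not the R/Bioconductor code used in this study. The gene names and statistics are invented, and the sketch assumes that the LogFC cutoff is applied to the absolute value so that both upregulated and downregulated genes are retained.

```python
from typing import Dict, List, Tuple

# Hypothetical per-gene statistics: gene -> (log2 fold change, p-value).
# The values are invented for illustration and are not taken from GSE22255.
stats: Dict[str, Tuple[float, float]] = {
    "GENE_A": (1.8, 0.003),
    "GENE_B": (-1.4, 0.020),
    "GENE_C": (0.6, 0.001),   # fold change too small
    "GENE_D": (2.1, 0.200),   # p-value too large
}


def select_degs(
    stats: Dict[str, Tuple[float, float]],
    lfc_cutoff: float = 1.0,
    p_cutoff: float = 0.05,
) -> Tuple[List[str], List[str]]:
    """Split genes passing both cutoffs into up- and downregulated lists."""
    up: List[str] = []
    down: List[str] = []
    for gene, (lfc, p) in sorted(stats.items()):
        # Assumption: the LogFC cutoff is applied to the absolute value.
        if abs(lfc) > lfc_cutoff and p < p_cutoff:
            (up if lfc > 0 else down).append(gene)
    return up, down


up, down = select_degs(stats)
print("upregulated:", up)
print("downregulated:", down)
```

In a sex-stratified design, the same function would be applied once to the female patient-versus-control statistics and once to the male ones.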

Gene ontology and pathway enrichment analysis of DEGs.
Gene Ontology (GO) [23] cellular component, molecular function, biological process and KEGG pathway enrichment were analysed using a web-based tool, Search Tool for the Retrieval of Interacting Genes (STRING) (version 10.5) (https://string-db.org/) [24]. Enrichment tests were based on the hypergeometric distribution [25]. False discovery rate was used to evaluate significance [26].
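As an illustration of the statistics named above, the following Python sketch computes a one-sided hypergeometric enrichment P-value and a false discovery rate correction. It is not the STRING implementation. The Benjamini-Hochberg procedure is assumed here as the FDR method, and all function and parameter names are hypothetical.

```python
from math import comb
from typing import List


def hypergeom_pvalue(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k): chance of seeing at least k annotated genes among n DEGs,
    when K of the N background genes carry the annotation."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example: 2 of 2 DEGs annotated, 5 of 10 background genes annotated.
p = hypergeom_pvalue(k=2, n=2, K=5, N=10)
print(round(p, 4))
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

Each GO term or KEGG pathway would be tested separately, and the adjustment would then be applied across all tested terms.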

Integration of Protein-Protein Interaction (PPI) network analysis.
STRING was used to evaluate the interactive (PPI) relationships between DEGs. Evidence-based interactions including text mining, experimental and database interactions achieving combined score >0.4 were selected as significant [24]. PPI networks were constructed using the Cytoscape software [27].
A Cytoscape plug-in, MCODE [28], was used to screen the modules of the identified PPI network. Modules were inferred using the default settings, with degree cutoff at 2, node score cutoff at 0.2, K core at 2, and a maximum depth of 100.
A Cytoscape plug-in, CytoNCA [29], which integrates calculation, evaluation, and visualization analysis for multiple centrality measures, was used to screen essential DEGs. The following topological and relationship characteristics of the PPI network were calculated: Betweenness Centrality (BC), Closeness Centrality (CC), Degree Centrality (DC), Eigenvector Centrality (EC), Local Average Connectivity-based Centrality (LAC), Network Centrality (NC), Subgraph Centrality (SC), and Information Centrality (IC). The appropriate minimum threshold varied and was determined by the distributions of network nodes and edges. The top 10 essential DEGs were screened.
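The network construction and centrality screening can be sketched in plain Python as follows. This is an illustration only, not the Cytoscape, MCODE or CytoNCA implementation. The protein names and scores are hypothetical, combined scores are assumed to be on a 0 to 1 scale, and only two of the eight centrality measures (degree and closeness) are shown.

```python
from collections import defaultdict, deque
from typing import Dict, List, Set, Tuple

# Hypothetical scored interactions: (protein, protein, combined score).
edges: List[Tuple[str, str, float]] = [
    ("P1", "P2", 0.92),
    ("P1", "P3", 0.71),
    ("P1", "P4", 0.45),
    ("P2", "P3", 0.55),
    ("P4", "P5", 0.38),  # below the cutoff, so this edge is dropped
]


def build_network(edges, cutoff: float = 0.4) -> Dict[str, Set[str]]:
    """Keep interactions with combined score > cutoff as undirected edges."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for a, b, score in edges:
        if score > cutoff:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def degree_centrality(graph: Dict[str, Set[str]]) -> Dict[str, int]:
    """Number of direct interaction partners per node."""
    return {node: len(nbrs) for node, nbrs in graph.items()}


def closeness_centrality(graph: Dict[str, Set[str]]) -> Dict[str, float]:
    """Reachable-node count divided by the summed shortest-path distances."""
    result: Dict[str, float] = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source node
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


network = build_network(edges)
degree = degree_centrality(network)
closeness = closeness_centrality(network)
top = sorted(degree, key=lambda n: (-degree[n], n))[:3]
print("top nodes by degree:", top)
```

Ranking nodes by several centrality measures and keeping those that score highly across measures is one simple way to approximate the "essential DEG" screening described above.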

Pathways enrichment and interrelation analysis.
Pathway enrichment analysis was carried out using the DAVID (Database for Annotation, Visualization and Integrated Discovery) database [30]. The group P-value threshold was set at <0.01 and the minimum number of genes per cluster at 4. Interrelation analysis of the enriched KEGG pathways and overlapping DEGs for female and male IS patients was conducted, and the interrelation network was reconstructed in Cytoscape.
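A pathway-gene interrelation network of this kind is bipartite: each edge links a DEG to a pathway it belongs to. The Python sketch below shows one way to derive the edge list and the number of pathways per gene. As an assumption for illustration, the input contains only the IL1B and IL6 pathway memberships stated in this article, not the full DAVID output, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Pathway -> member DEGs. Only the IL1B and IL6 memberships stated in this
# article are listed; a real table would hold the full enrichment output.
pathway_genes: Dict[str, Set[str]] = {
    "NOD-like receptor signalling pathway": {"IL1B", "IL6"},
    "Cytokine-cytokine receptor interaction": {"IL1B", "IL6"},
    "TNF-signalling pathway": {"IL1B", "IL6"},
    "Rheumatoid arthritis": {"IL1B", "IL6"},
    "NF-kappa B-signalling pathway": {"IL1B"},
}


def gene_pathway_edges(pathway_genes: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Return sorted (gene, pathway) edges of the bipartite network."""
    return sorted(
        (gene, pathway)
        for pathway, genes in pathway_genes.items()
        for gene in genes
    )


def pathways_per_gene(pathway_genes: Dict[str, Set[str]]) -> Dict[str, int]:
    """Count how many pathways each gene participates in."""
    counts: Dict[str, int] = defaultdict(int)
    for gene, _pathway in gene_pathway_edges(pathway_genes):
        counts[gene] += 1
    return dict(counts)


counts = pathways_per_gene(pathway_genes)
for gene in sorted(counts, key=lambda g: (-counts[g], g)):
    print(gene, counts[gene])
```

The edge list can be exported as a two-column table and imported into a network viewer, where genes shared by many pathways appear as hubs.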

Results
The bioinformatic analysis followed the workflow shown in S2 Fig.

Identification of DEGs
The GSE22255 data set contained 40 peripheral blood samples, including 10 female IS patients, 10 male IS patients, 10 female controls, and 10 male controls. IS patients were required to have suffered only one stroke episode, at least 6 months before the blood collection, and controls could not have a family history of stroke. Participants with severe anemia or active allergies were also excluded [19]. Sex-specific DEGs were obtained by comparing female IS patients with female controls, and male IS patients with male controls, respectively. All sex-specific DEGs are presented in Fig 1. In the female dataset, 185 DEGs (73 upregulated and 112 downregulated) were obtained, while 461 DEGs (297 upregulated and 164 downregulated) were obtained in the male dataset. Of these, 118 DEGs overlapped between the two datasets (7 with similar changes and 111 with opposite changes in females and males).
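The grouping of overlapping DEGs into similar and opposite changes can be sketched as follows. The gene names and fold changes are hypothetical and serve only to show the logic: a shared DEG counts as "similar" when its log fold change has the same sign in both sexes and as "opposite" otherwise.

```python
from typing import Dict, List, Tuple

# Hypothetical DEG tables: gene -> log2 fold change (patients vs controls).
female_degs: Dict[str, float] = {"G1": -1.5, "G2": 1.2, "G3": -2.0}
male_degs: Dict[str, float] = {"G1": 1.7, "G3": -1.1, "G4": 1.3}


def classify_overlap(
    female: Dict[str, float], male: Dict[str, float]
) -> Tuple[List[str], List[str]]:
    """Split shared DEGs into same-direction and opposite-direction lists."""
    shared = sorted(set(female) & set(male))
    similar = [g for g in shared if female[g] * male[g] > 0]
    opposite = [g for g in shared if female[g] * male[g] < 0]
    return similar, opposite


similar, opposite = classify_overlap(female_degs, male_degs)
print("similar:", similar)
print("opposite:", opposite)
```

Applied to the reported results, the two lists would contain 7 and 111 genes, which together make up the 118 overlapping DEGs.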
GO and pathway enrichment analysis.
The top 5 or fewer enrichment results are shown for each part of the GO analysis. The cellular component, molecular function and biological process enrichment results are shown in Table 1.

PPI network and module screening.
Based on the information in the STRING database, female-specific (Fig 2A), male-specific (Fig 3A) and overlapping (Fig 4A) DEGs were put into PPI networks. The PPI networks were analyzed using the Cytoscape plug-in MCODE, and significant modules were clustered (Fig 2B, Fig 3B, Fig 4B, Table 2). The topologically essential DEGs screened by CytoNCA are outlined by circles with thicker borders (Fig 2A, Fig 3A, Fig 4A). The 10 overlapping DEGs with the most significant fold changes are shown in parallel with the PPI networks (Fig 2C, Fig 3C, Fig 4C).

KEGG pathway enrichment and interrelation analysis.
For the female- and male-specific DEGs that presented opposite modulatory mechanisms among the overlapping DEGs, pathway interrelations were used to investigate the bi-modulatory (female-down-male-up) DEGs, as shown in Table 3. These genes are mainly involved in pathways including NOD-like receptor signalling pathways, cytokine-cytokine receptor interactions, the TNF-signalling pathway, Rheumatoid arthritis, and the NF-kappa B-signalling pathway (Fig 5). The detailed interactions are included in S3 Fig_html. Interleukin-1B (IL1B) and Interleukin-6 (IL6) took part in the NOD-like receptor signalling pathways, cytokine-cytokine receptor interactions, the TNF-signalling pathway, and the Rheumatoid arthritis pathway; C-X-C

Discussion
Genome-wide association studies (GWAS) have exposed significant aspects of ischemic stroke [31]. Several important SNPs associated with IS have been discovered, including 5 C-reactive protein (CRP) SNPs [32]; in addition, a SNP (rs9943582, -154G/A) in the 5' flanking region of APLNR was shown to be significantly associated with stroke in the Japanese population. In the previous study, 16 genes, including TTC7B, were significantly changed in the IS group compared with age- and gender-matched controls [19]. On the hypothesis that female and male IS have different mechanisms, female and male IS patients were each compared with healthy controls of the same sex. As the results showed, 185 DEGs (73 upregulated and 112 downregulated) were obtained in the female dataset and 461 DEGs (297 upregulated and 164 downregulated) in the male dataset. Among the sex-specific DEGs, there was a group of overlapping DEGs with opposite changes, which leads to a comparatively smaller number of DEGs in the overall group. The IS blood samples also presented a smaller number of DEGs than early or advanced atherosclerotic plaque samples [33]. The probable reason is that IS is a less stable pathological state and lacks biological markers compared with atherosclerotic plaque. Males had more than twice as many DEGs as females and a larger proportion of upregulated DEGs in this study, which indicated an underlying recovery mechanism for male IS protection and better outcomes than in females [3]. After the identification of sex-specific DEGs in IS, the DEGs were subjected to multi-step bioinformatic functional annotation. The functionally annotated DEGs, including 126 (68.1%) female-specific, 250 (54.2%) male-specific, and 75 (63.6%) overlapping DEGs, were used for further GO, KEGG and PPI analysis. Although the functional annotation categories did not cover all the DEGs, this information could still provide an overview of, and suggest therapeutic drug strategies for, sex-specific and opposite modulatory mechanisms in IS patients.
The cross-talk between the vascular and immune systems plays a critical role in both female and male strokes. Female, male, and overlapping DEGs were enriched in the functional categories of chemokine receptor binding (GO.0042379) and cytokine activity (GO.0005125) (Table 1). In this study, IL6 showed high centrality in the female, male and overlapping PPI networks (Figs 2, 3 and 4). KEGG pathway enrichment and interrelation analysis of the female and male overlapping DEGs showed the interrelations between the pathways and the bi-modulatory (female-down-male-up) DEGs (Table 3, Fig 5). These genes were mainly involved in pathways including NOD-like receptor signalling pathways, cytokine-cytokine receptor interactions, the TNF-signalling pathway, Rheumatoid arthritis, and the NF-kappa B-signalling pathway. IL1B took part in 5 pathways, and IL6 took part in 4 pathways, namely the NOD-like receptor signalling pathways, cytokine-cytokine receptor interactions, the TNF-signalling pathway, and the Rheumatoid arthritis pathway; CXCL1, CXCL2, and CXCL3 took part in 2-3 pathways, including the TNF-signalling pathway, cytokine-cytokine receptor interactions, and NOD-like receptor signalling pathways. Several studies have shown that immune responses involving IL6 [34], IgA [35], CXCL16 [36], TNFSF4 [37], and IL10 [38] play a role in stroke. IL6 and infarction size were reported as independent predictors of short-term stroke outcome in young Egyptian adults [34]. Regarding the opposite changes of these DEGs, females presented downregulated tendencies for these inflammatory and immune process-related pathways, while males presented the opposite (Fig 3). Divergent immunological alterations were also observed in mouse MCAO studies, where males had a greater percentage of activated macrophages/microglia in the brain than females, as well as increased expression of VLA-4 adhesion molecules in both the brain and spleen [39].
However, microglia (MG) from female mice had higher expression of IL-4 and IL-10 receptors and increased production of IL-4, especially after treatment with IL-10 (+) B-cells, which indicated that females had heightened sensitivity of MG to IL-4 and IL-10, as direct B-cell/MG interactions promote M2-MG [40]. The incongruent results on immune cell participation were related to stroke stage and age. In our study, the upregulated DEGs, including interleukins and CXCLs, indicated that chronic immune activation of cytokine pathways in males might lead to a protective, rather than damaging, process of neuron regeneration. Another aspect of female and male IS mechanisms would be the cross-talk between inflammation and immune processes and apoptosis pathways. In the GO biological process enrichment (Table 1), the male-specific DEGs were involved in regulation of the apoptotic process (GO.0042981), regulation of cell death (GO.0010941), positive regulation of cell death (GO.0010942), and positive regulation of programmed cell death (GO.0043068). The NF-kappa B-signalling pathway, including BCL2A1, CCL4, CXCL2, ICAM1, IL1B, and PTGS2, was significantly enriched in male IS patients (Fig 5). Male IS presented upregulated BCL2A1 and CXCL2, while female IS presented downregulated BCL2A1 in this study (Fig 4). The anti-apoptosis protein BCL2 could provide neuroprotection via different pathways [41]. Therefore, differences in apoptosis modulation between females and males could be a cause of their different outcomes. The mitochondrion is a key factor in both acute and chronic stroke damage recovery [42]. However, since our dataset is based on human blood samples, this platform is insufficient for screening metabolic processes that mainly occur in the cytoplasm and mitochondrion.
The sample size is another limitation of this study. Nevertheless, we hope the significant results will draw more attention to sex-specific mechanisms, which could lead to modifications of current clinical practice.

Conclusion
The analysis of sex-specific DEGs from GEO human blood samples showed not only sex-specific but also opposite DEG alterations, such as those of IL6, in females and males in the stroke genome-wide dataset. The inflammatory immune process and anti-apoptosis pathways presented divergent sex-specific alterations in IS. The results provided an overview of sex-specific mechanisms, which delivered further insight into stroke and potential biomarkers that can lead to sex-specific prognosis and treatment strategies in future clinical practice. However, detailed experimental validation through a combination of in vitro and in vivo testing is still required.