Fig 1.
BC: Breast Cancer, PPI: protein–protein interaction.
Table 1.
Construction of the PPI network based on BC-related genes obtained from the DISEASES database.
Fig 2.
PPI network of BC-related genes and centrality analysis.
(A) PPI network constructed with 920 nodes from the STRING database, derived from 1,000 query genes. Medium confidence edges (minimum required interaction score = 0.4) were applied. Active interaction sources were experiments, databases, and co-expression. (B) Results of centrality analysis. The upper panel shows scores for the local-based “Degree” method; the lower panel shows scores for the global-based “Betweenness” method. Deeper red node colors indicate higher centrality scores. Nodes with a degree of zero are not displayed. BC: Breast Cancer, PPI: protein–protein interaction.
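The edge filter in panel A and the degree scores in panel B can be illustrated with a minimal sketch. It assumes edges are available as (protein, protein, score) tuples with scores on a 0 to 1 scale. The function name `degree_centrality` and the toy edge list are hypothetical and are not taken from the study's pipeline. Betweenness is not covered here.

```python
from collections import defaultdict

MEDIUM_CONFIDENCE = 0.4  # minimum required interaction score stated in the caption


def degree_centrality(edges, min_score=MEDIUM_CONFIDENCE):
    """Count distinct neighbors per node, using only edges at or above min_score."""
    neighbors = defaultdict(set)
    for a, b, score in edges:
        if score >= min_score and a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    # Nodes with a degree of zero never enter the mapping, mirroring the figure.
    return {node: len(adj) for node, adj in neighbors.items()}


# Hypothetical toy edges: gene symbols are real, scores are made up.
toy_edges = [
    ("TP53", "BRCA1", 0.95),
    ("TP53", "ESR1", 0.62),
    ("TP53", "AKT1", 0.80),
    ("BRCA1", "ESR1", 0.41),
    ("ESR1", "HOXB13", 0.20),  # below the threshold, so this edge is dropped
]
degrees = degree_centrality(toy_edges)
# Rank by descending degree, breaking ties alphabetically.
ranking = sorted(degrees, key=lambda gene: (-degrees[gene], gene))
```

In this toy input, `HOXB13` ends up with no retained edges and is therefore absent from `degrees`, which is how a degree-zero node would disappear from the plot.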
Fig 3.
Proportion of BC biomarkers included in the top 5% of genes from each centrality ranking.
(A) Centrality analysis results for PPI networks constructed with medium (light blue bar), high (blue bar), and highest (orange bar) confidence edges. Each bar represents the average proportion of biomarkers from networks built with 1,000, 1,500, and 2,000 query genes. Error bars indicate the standard deviation. (B) Centrality analysis results for PPI networks constructed with nodes from 1,000 (light blue bar), 1,500 (blue bar), and 2,000 (orange bar) query genes. Each bar represents the average proportion of biomarkers across all three confidence levels. Error bars indicate the standard deviation. BC: Breast Cancer, PPI: protein–protein interaction.
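The bar heights and error bars described above can be sketched as follows. This is an illustration under stated assumptions: the proportion is taken as the number of biomarkers in the top 5% divided by the total number of biomarkers, the top 5% cutoff is rounded up, and the sample standard deviation is used. The source does not specify any of these choices, and all gene names and rankings below are placeholders.

```python
import math
from statistics import mean, stdev


def biomarker_proportion(ranked_genes, biomarkers, top_percent=5):
    """Fraction of biomarkers found in the top `top_percent` of a ranking."""
    # Rounding up is an assumption; the source does not state the rounding rule.
    top_n = math.ceil(len(ranked_genes) * top_percent / 100)
    top_genes = set(ranked_genes[:top_n])
    return len(top_genes & set(biomarkers)) / len(biomarkers)


# Hypothetical rankings of 40 placeholder genes, one per query-gene size.
genes = [f"G{i}" for i in range(40)]
biomarkers = {"G0", "G1", "G5", "G30"}
rankings = {
    1000: genes,                    # G0 and G1 fall in the top 2
    1500: genes[1:] + genes[:1],    # only G1 falls in the top 2
    2000: list(reversed(genes)),    # no biomarker falls in the top 2
}
proportions = [biomarker_proportion(r, biomarkers) for r in rankings.values()]
bar_height = mean(proportions)
error_bar = stdev(proportions)  # sample SD; an assumption, not stated in the source
```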
Fig 4.
Proportion of BC biomarkers in the top 5% of MCC rankings.
The results of MCC analysis are shown for PPI networks constructed from a range of 1,000 to 2,000 query genes with the top disease scores. Edge conditions were medium (light blue bar), high (blue bar), and highest (orange bar) confidence. BC: Breast Cancer, PPI: protein–protein interaction.
Fig 5.
Spearman’s rank correlation coefficient and hierarchical clustering of centrality rankings for BC biomarkers.
The analysis used 62 biomarkers with a degree of ≥1 from the PPI network based on 1,500 query genes. Two biomarkers with a degree of 0 (HOXB13 and SLC39A) were excluded. BC: Breast Cancer, PPI: protein–protein interaction.
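For readers unfamiliar with the statistic, Spearman's rank correlation coefficient is the Pearson correlation of the ranks, with tied values sharing the mean of their positions. The sketch below is a plain-Python illustration on made-up ranks for five placeholder biomarkers. It is not the software used in the study, and it omits the hierarchical clustering step, whose distance measure the source does not state.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical centrality ranks of five biomarkers under three methods.
degree_rank = [1, 2, 3, 4, 5]
mcc_rank = [1, 3, 2, 4, 5]
betweenness_rank = [5, 4, 3, 2, 1]
rho_degree_mcc = spearman(degree_rank, mcc_rank)
rho_degree_betweenness = spearman(degree_rank, betweenness_rank)
```

Computing the coefficient for every pair of centrality methods gives the kind of correlation matrix on which a clustering step can operate.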
Table 2.
Top 5% of genes in the MCC ranking for the PPI network based on 1,500 query genes.
Table 3.
GO terms for the top 5% of MCC-ranked genes.
Fig 6.
REVIGO clustering of GO terms from top 5% MCC-ranked genes.
Semantically similar GO terms are positioned closely in the two-dimensional space. The number inside each bubble corresponds to the GO term list in Table 3. Bubble color indicates the log p-value, and the size indicates the term’s frequency in the underlying EBI GOA database. GO: gene ontology.
Fig 7.
Association of BC biomarker MCC ranking with biological features.
(A) Comparison of the evolutionary conservation score between high- and low-MCC ranking groups. The 64 BC biomarkers were divided into two groups, each comprising 32 genes. (B) Comparison of the node score for complex prediction between high- and low-MCC ranking groups. The 62 connected BC biomarkers were divided into two groups, each comprising 31 genes. HOXB13 and SLC39A were excluded from the protein complex analysis because their degree of 0 prevented the calculation of the node score.
Fig 8.
Prediction of key genes as BC biomarker candidates.
(A) Venn diagram illustrating the filtering of the top 5% MCC-ranked genes based on survival and differential gene expression analyses. Among the top 68 genes in the MCC ranking (Table 2), the 45 genes that are not known biomarkers were analyzed. For survival analysis, patients were split by median gene expression, and genes with significance in both RFS and OS (P < 0.05) were selected. For differential gene expression analysis, genes with significant expression differences (P < 0.05) in both normal vs. tumor and tumor vs. metastatic tissues were selected. (B) MCC rank of the 11 identified key genes and the number of interactions with 128 known BC biomarkers. The number of direct interactions for each key gene was counted in the STRING database under medium confidence using “Experiments,” “Databases,” and “Co-expression” as sources. BC: Breast Cancer.
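The filtering logic in panel A and the interaction count in panel B can be sketched with set operations. This is a minimal illustration, assuming the P values are available as gene-to-value mappings. The function names, gene labels, P values, and edges are all hypothetical and do not come from the study.

```python
ALPHA = 0.05  # significance threshold stated in the caption


def significant_in_all(p_value_maps, alpha=ALPHA):
    """Genes whose P value is below alpha in every supplied comparison."""
    gene_sets = [{g for g, p in pv.items() if p < alpha} for pv in p_value_maps]
    return set.intersection(*gene_sets)


def select_key_genes(top_genes, known_biomarkers, survival_hits, expression_hits):
    """Keep top-ranked genes that are not known biomarkers and pass both filters."""
    candidates = [g for g in top_genes if g not in known_biomarkers]
    return [g for g in candidates if g in survival_hits and g in expression_hits]


def count_biomarker_interactions(gene, edges, known_biomarkers):
    """Number of distinct known biomarkers directly linked to `gene`."""
    partners = set()
    for a, b in edges:
        if a == gene and b in known_biomarkers:
            partners.add(b)
        elif b == gene and a in known_biomarkers:
            partners.add(a)
    return len(partners)


# Placeholder data: A-E are top-ranked genes, M1 and M2 are other biomarkers.
top_genes = ["A", "B", "C", "D", "E"]
known_biomarkers = {"A", "M1", "M2"}
rfs = {"B": 0.01, "C": 0.03, "D": 0.20, "E": 0.04}
overall = {"B": 0.02, "C": 0.30, "D": 0.01, "E": 0.01}
normal_vs_tumor = {"B": 0.001, "C": 0.001, "D": 0.001, "E": 0.2}
tumor_vs_metastatic = {"B": 0.04, "C": 0.01, "D": 0.01, "E": 0.01}

survival_hits = significant_in_all([rfs, overall])
expression_hits = significant_in_all([normal_vs_tumor, tumor_vs_metastatic])
key_genes = select_key_genes(top_genes, known_biomarkers, survival_hits, expression_hits)

edges = [("B", "M1"), ("M2", "B"), ("B", "C"), ("B", "M1")]
n_links = count_biomarker_interactions("B", edges, known_biomarkers)
```

Duplicate edges are counted once because interaction partners are collected in a set.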