Fig 1.
An overview of the optimized Cox-nnet neural network architecture used in this study.
Cox-nnet is composed of one hidden layer and an output “Cox-regression” layer. It is optimized to work on high-dimensional gene expression data. The model is trained by back-propagation to minimize the negative partial log-likelihood.
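The architecture and training objective described in this caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the tanh activation, and the assumption of untied event times are illustrative choices made here, and no regularization (such as dropout) is included.

```python
import numpy as np

def cox_nnet_forward(X, W, b, beta):
    """One hidden layer followed by a linear Cox output.

    X: (n_patients, n_genes), W: (n_genes, n_hidden), b: (n_hidden,),
    beta: (n_hidden,). Returns the log hazard ratio per patient.
    The tanh activation is an assumption of this sketch.
    """
    hidden = np.tanh(X @ W + b)
    return hidden @ beta

def neg_partial_log_likelihood(theta, time, event):
    """Negative Cox partial log-likelihood.

    theta: log hazard ratios, time: follow-up times,
    event: 1 if the event was observed, 0 if censored.
    Assumes no tied times (ties would need Breslow/Efron handling).
    """
    order = np.argsort(-time)          # longest follow-up first
    theta = theta[order]
    event = event[order]
    # Running log-sum-exp gives log(sum of exp(theta)) over each risk set,
    # i.e. over all patients whose time is >= the current patient's time.
    log_risk = np.logaddexp.accumulate(theta)
    return -np.sum((theta - log_risk) * event)
```

In a full model, the gradient of this loss with respect to `W`, `b`, and `beta` would be obtained by back-propagation, typically through an automatic differentiation library.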
Fig 2.
A. Boxplot of the C-IPCW of the 10 TCGA datasets using four prognosis-predicting methods: Cox-nnet (dropout), CoxBoost, Cox-PH (ridge) and RF-S. The data were randomly split into 80% training and 20% testing sets, and the split was repeated 10 times. Average C-IPCWs are presented as the metric. For the “overall” condition, all 10 TCGA cancer datasets are combined as one “cancer” dataset. An asterisk (*) indicates statistical significance (p < 0.05). B. Heatmap of the performance rank of each method on each dataset, based on the order of the average C-IPCW scores. Ranks 1, 2, 3, and 4 indicate the methods in descending order of performance.
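The ranking shown in panel B can be derived from a datasets-by-methods matrix of average scores. The sketch below assumes such a matrix is available. The function name and the example values are hypothetical and are not results from the study.

```python
import numpy as np

def rank_methods(mean_scores):
    """Rank methods within each dataset; rank 1 is the highest mean score.

    mean_scores: (n_datasets, n_methods) array of average C-IPCW values.
    Returns an integer array of the same shape holding ranks 1..n_methods.
    """
    order = np.argsort(-mean_scores, axis=1)        # best method first
    ranks = np.empty_like(order)
    rows = np.arange(mean_scores.shape[0])[:, None]
    ranks[rows, order] = np.arange(1, mean_scores.shape[1] + 1)
    return ranks

# Hypothetical placeholder values for two datasets and four methods.
example = np.array([[0.70, 0.60, 0.65, 0.50],
                    [0.55, 0.62, 0.58, 0.61]])
ranks = rank_methods(example)
```

The resulting rank matrix can be passed directly to a heatmap plotting routine, with one row per dataset and one column per method.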
Fig 3.
A. Hidden node activation weighted by the corresponding Cox layer coefficients of the TCGA KIRC dataset. The columns represent individual patient scores, ordered by their Prognostic Index. The rows represent the node activations. B. t-SNE plot of the top 20 nodes (left) and t-SNE plot of the genes differentially expressed between the two groups with low and high Prognostic Index, respectively (right). C. Gene Set Enrichment Analysis: significantly enriched KEGG pathways of the top 20 hidden nodes (adjusted p-value < 0.05).
Table 1.
Cox-nnet node-associated pathways.
Significantly enriched pathways common to all 20 hidden nodes that are not found in the Cox-PH Gene Set Enrichment Analysis (adjusted p-value < 0.05).
Fig 4.
Single variable C-IPCW scores of the leading edge genes from Cox-nnet and Cox-PH.
The leading edge genes are obtained using Gene Set Enrichment Analysis; they are the genes that contribute positively to the maximum value of the pathway enrichment score [29]. The leading edge genes from Cox-nnet have significantly higher C-IPCW scores (p = 1.253e-05).
Fig 5.
Enriched pathway-gene bipartite network from the leading edge genes and significantly enriched pathways.
Significantly enriched pathways common to all 20 hidden nodes are labeled in green. Leading edge genes found uniquely in Cox-nnet are labeled in orange, and genes found in both Cox-nnet and Cox-PH are labeled in blue.