NetNorM: Capturing cancer-relevant information in somatic exome mutation data with gene networks for cancer stratification and prognosis
Fig 5
(a) Comparison of survival prediction performance according to patients’ mutational burden for LUAD. Three representations of the mutations are used to perform survival prediction with a ranking SVM: raw (the raw binary mutation data), NSQN (network smoothing with quantile normalisation) and NetNorM. Performance for the half of patients with fewer (resp. more) mutations is derived from the predictions made using the whole dataset. (b) Scatter plot of the total number of mutations in a patient of the LUAD cohort (x-axis) against the number of mutated neighbours of KHDRBS1 in that patient (y-axis). Only patients with fewer than kmed = 295 mutations are shown, where kmed is the median value of k learned across cross-validation folds. Red (resp. blue) points indicate patients mutated (resp. non-mutated) in KHDRBS1 after processing with NetNorM using k = kmed. The black line was fit by linear regression and by definition indicates the expected number of mutated neighbours of KHDRBS1 given the mutational burden of a patient.
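The regression line in panel (b) can be illustrated with a minimal sketch. This is not the authors' code: the function name `expected_mutated_neighbours`, its parameters, and the toy arrays are all hypothetical, and ordinary least squares via `numpy.polyfit` is assumed as the fitting method. The sketch keeps only patients with fewer than a threshold number of mutations, then fits a straight line of mutated-neighbour count against total mutation count.

```python
import numpy as np


def expected_mutated_neighbours(total_mutations, mutated_neighbours, k_max):
    """Fit neighbours = slope * total + intercept on patients below k_max.

    Hypothetical helper: returns the fitted slope and intercept, plus the
    boolean mask of patients that were kept for the fit.
    """
    x = np.asarray(total_mutations, dtype=float)
    y = np.asarray(mutated_neighbours, dtype=float)
    keep = x < k_max  # only patients with fewer than k_max mutations
    slope, intercept = np.polyfit(x[keep], y[keep], deg=1)
    return slope, intercept, keep


# Toy values for illustration only (not data from the paper).
total = np.array([50, 100, 150, 200, 250, 400])
neighbours = np.array([1, 2, 3, 4, 5, 20])

slope, intercept, keep = expected_mutated_neighbours(total, neighbours, k_max=295)

# Residuals: patients above the line have more mutated neighbours than
# expected for their mutational burden.
residuals = neighbours[keep] - (slope * total[keep] + intercept)
```

In this sketch, a positive residual marks a patient whose gene of interest has more mutated neighbours than the fitted line predicts for that patient's mutational burden.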