Predicting regional somatic mutation rates using DNA motifs
Fig 5
Analysis of the cancer-related regions.
(a) The identified cancer-related regions in Breast-AdenoCA (red dots); (b) The enriched pathways for the cancer-related regions in Breast-AdenoCA; (c) The fold change and p-value for the motif disruption rates in the 13 tumor types. The red line marks a p-value of 0.05; (d) The fold change of chromHMM states (same as in Fig 3E) for the cancer-related regions in each tumor type; (e) The percentages of motif types that were significantly disrupted in cancer-related regions; (f) The classification model performance, shown as the confusion matrix for the classification model using the 150 selected cancer-related regions on the testing dataset. Rows and columns correspond to the true and predicted tumor types, respectively. Values on the diagonal are the numbers of donors classified correctly. For example, for Prost-AdenoCA, 31 donors were correctly classified.
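As a reading aid for panel (f), the following minimal sketch shows how a confusion matrix with true types in rows and predicted types in columns can be tabulated, and how the per-type counts of correctly classified donors are read off the diagonal. The donor labels and the helper names `confusion_matrix` and `correct_per_class` are hypothetical and purely illustrative; this is not the authors' code or data.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, classes):
    """Tabulate counts with rows = true type and columns = predicted type."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]


def correct_per_class(matrix, classes):
    """Read the correctly classified count for each type off the diagonal."""
    return {c: matrix[i][i] for i, c in enumerate(classes)}


# Toy labels for five hypothetical donors (NOT data from the paper).
classes = ["Breast-AdenoCA", "Prost-AdenoCA"]
true_labels = [
    "Breast-AdenoCA", "Breast-AdenoCA",
    "Prost-AdenoCA", "Prost-AdenoCA", "Prost-AdenoCA",
]
predicted_labels = [
    "Breast-AdenoCA", "Prost-AdenoCA",
    "Prost-AdenoCA", "Prost-AdenoCA", "Breast-AdenoCA",
]

matrix = confusion_matrix(true_labels, predicted_labels, classes)
correct = correct_per_class(matrix, classes)

for name, row in zip(classes, matrix):
    print(name, row)
print(correct)
```

In this toy example, two of the three Prost-AdenoCA donors fall on the diagonal; the off-diagonal entries count donors assigned to the wrong type.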