Identification of High-Impact cis-Regulatory Mutations Using Transcription Factor Specific Random Forest Models
Fig 7
Candidate cis-regulatory driver SNVs and insertions across 498 breast cancer genomes.
A) All SNVs and insertions with high PRIME score (>0.3) (insertions are within the black box) found by M1 models in the regulatory regions around cancer related genes and 167 TFs expressed in breast cancer (all significant PRIME scores with model-specific thresholds are provided in S5–S6 Tables). Values inside boxes indicate the recurrence, that is the number of samples where this variant was found across the 498 TCGA samples. B) An example of a high scoring recurrent insertion that is predicted to generate a TP53 gain of target in the vicinity of SOX5. Z-scores of the SOX5 gene expression are significantly higher (Wilcoxon rank sum test) in the 33 samples with the insertion, compared to samples without the insertion.