Genome-wide prediction of topoisomerase IIβ binding by architectural factors and chromatin accessibility
Fig 2
Machine learning schema for the prediction of TOP2B binding.
TOP2B binding sites and random regions were first identified. Then, 15 high-throughput sequencing experiments, together with DNA sequence and shape features, were scored around these regions, which resulted in a data matrix with rows representing TOP2B or random sites and columns representing the scored features. Finally, binary classifiers were trained and tested using 5-fold cross-validation, and feature selection was applied to identify the most informative features.
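The schema described in this caption can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data matrix is synthetic (Gaussian noise with two informative columns), the nearest-centroid classifier stands in for the binary classifiers, which the caption does not name, and ranking features by single-feature accuracy stands in for the feature selection step. All function names (`make_matrix`, `stratified_folds`, `cv_accuracy`) are hypothetical.

```python
import random
from statistics import mean

def make_matrix(n_per_class=100, n_features=6, n_informative=2, seed=0):
    """Synthetic stand-in for the data matrix: rows are sites, columns are scored features."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (1, 0):  # 1 = TOP2B site, 0 = random region
        for _ in range(n_per_class):
            # Informative columns are shifted for TOP2B rows; the rest are pure noise.
            row = [rng.gauss(2.0 * label if j < n_informative else 0.0, 1.0)
                   for j in range(n_features)]
            X.append(row)
            y.append(label)
    return X, y

def stratified_folds(y, k=5, seed=0):
    """Split row indices into k folds, keeping class proportions roughly equal."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def centroid(rows, cols):
    """Per-column mean of the given rows, restricted to the columns in cols."""
    return [mean(r[c] for r in rows) for c in cols]

def predict(row, cols, centroids):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def dist(label):
        return sum((row[c] - m) ** 2 for c, m in zip(cols, centroids[label]))
    return min(centroids, key=dist)

def cv_accuracy(X, y, cols, k=5, seed=0):
    """Mean held-out accuracy over k folds using only the columns in cols."""
    scores = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        # Centroids are fitted on the training rows only, then scored on the held-out fold.
        centroids = {
            label: centroid([X[i] for i in train_idx if y[i] == label], cols)
            for label in set(y)
        }
        hits = sum(predict(X[i], cols, centroids) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return mean(scores)

X, y = make_matrix()
all_cols = list(range(len(X[0])))
full_acc = cv_accuracy(X, y, all_cols)
# Rank features by how well each one classifies on its own (assumed selection criterion).
ranking = sorted(all_cols, key=lambda c: cv_accuracy(X, y, [c]), reverse=True)
print(f"5-fold accuracy, all features: {full_acc:.2f}")
print("features ranked by single-feature accuracy:", ranking)
```

In this toy setting the two shifted columns should rank above the noise columns, mirroring how feature selection is meant to surface the most informative features. A real analysis would replace the synthetic matrix with the scored sequencing, sequence, and shape features.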