RFECS: A Random-Forest Based Algorithm for Enhancer Identification from Chromatin State
Figure 3
Out-of-bag variable importance of histone modifications in enhancer prediction.
The average variable importance of histone modifications across 5 cross-sections of data in 2 sets of replicates, as well as in averaged replicates, using all 24 modifications in A.) H1 and B.) IMR90 cells. Out-of-bag variable importance was calculated from the random-forest based classification of p300 binding sites against TSS+genomic background. The robust appearance of H3K4me1, H3K4me3 and H3K4me2 among the most important marks across replicates and cell types indicates that these may form a minimal set for prediction of enhancers. Differences observed in correlation clustering of the same 24 modifications in C.) H1 and D.) IMR90 explain some of the differences in ordering of variables in the two cell types. Modifications shown in the same non-black color belong to clusters that co-occur in both cell types.
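To make the quantity plotted in panels A and B concrete, the following is a minimal sketch of permutation-based out-of-bag variable importance. It is not the implementation used for the figure. It assumes the standard definition (the mean increase in out-of-bag error when one variable is permuted), replaces full trees with depth-one trees to stay short, and runs on synthetic data. All names (`fit_stump`, `oob_importance`, the three synthetic features) are hypothetical.

```python
import random

def fit_stump(X, y, features):
    """Pick the (feature, threshold) split with the fewest training errors."""
    best = None
    for j in features:
        for thr in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= thr]
            right = [lab for row, lab in zip(X, y) if row[j] > thr]
            # Each side predicts its majority class (ties go to class 0).
            left_label = 1 if 2 * sum(left) > len(left) else 0
            right_label = 1 if 2 * sum(right) > len(right) else 0
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, j, thr, left_label, right_label)
    return best

def predict(stump, row):
    _, j, thr, left_label, right_label = stump
    return left_label if row[j] <= thr else right_label

def error_rate(stump, X, y):
    return sum(predict(stump, row) != lab for row, lab in zip(X, y)) / len(y)

def oob_importance(X, y, n_trees=50, mtry=2, seed=0):
    """Mean increase in out-of-bag error after permuting each variable."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    totals = [0.0] * p
    used = 0
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        in_bag = set(boot)
        oob = [i for i in range(n) if i not in in_bag]
        if not oob:
            continue
        # Assumption: each tree sees a random subset of mtry variables.
        stump = fit_stump([X[i] for i in boot], [y[i] for i in boot],
                          rng.sample(range(p), mtry))
        y_oob = [y[i] for i in oob]
        base = error_rate(stump, [X[i] for i in oob], y_oob)
        for j in range(p):
            shuffled = [X[i][j] for i in oob]
            rng.shuffle(shuffled)                     # break link to the label
            permuted = [X[i][:j] + [v] + X[i][j + 1:]
                        for i, v in zip(oob, shuffled)]
            totals[j] += error_rate(stump, permuted, y_oob) - base
        used += 1
    return [t / used for t in totals]

# Synthetic data: feature 0 drives the label (10% label noise); 1 and 2 are noise.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(200)]
y = [1 if (row[0] > 0.5) != (data_rng.random() < 0.1) else 0 for row in X]

importance = oob_importance(X, y)
for name, value in zip(["informative", "noise_1", "noise_2"], importance):
    print(f"{name}: {value:.3f}")
```

In this toy setting the informative feature should receive a clearly larger score than the two noise features, which is the sense in which a mark ranks as "most important" in the figure. Some implementations additionally divide the mean increase by its standard deviation across trees; the sketch reports the raw mean.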