Inherent limitations of probabilistic models for protein-DNA binding specificity
Fig 2
The correlation between the predicted and true all-sequence distributions.
(a) Correlation between the true distribution (on a logarithmic scale, with the highest-affinity site set to 0) and the distribution predicted by the PM built from all binding sites, weighted. (b) Correlation between the true distribution (on a logarithmic scale) and the distribution predicted by the PM built from the top 1% of binding sites, weighted. (c) Correlation between the true distribution (on a logarithmic scale) and the distribution predicted by the PM built from the top 1% of binding sites, unweighted.
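
The three comparisons in this caption can be illustrated with a small simulation. The sketch below is not the procedure used for the figure; it is a toy example under stated assumptions: sites of length 6, a purely additive random energy model standing in for the "true" distribution, weights equal to the true binding probabilities, and a small pseudocount so that unseen bases do not produce log(0). All function and variable names are hypothetical.

```python
import itertools
import numpy as np

def all_sites(length):
    """Every DNA sequence of the given length, encoded A,C,G,T -> 0..3."""
    return np.array(list(itertools.product(range(4), repeat=length)))

def build_pm(sites, weights, pseudocount=1e-6):
    """Position-specific probability matrix (length x 4) from weighted sites."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    pm = np.full((sites.shape[1], 4), pseudocount)  # assumed pseudocount
    for pos in range(sites.shape[1]):
        # Add up the weights of the sites carrying each base at this position.
        pm[pos] += np.bincount(sites[:, pos], weights=weights, minlength=4)
    return pm / pm.sum(axis=1, keepdims=True)

def log_scores(pm, sites):
    """Log probability of each site under the PM, with the best site set to 0."""
    logp = np.log(pm)[np.arange(sites.shape[1]), sites].sum(axis=1)
    return logp - logp.max()

rng = np.random.default_rng(0)
length = 6
sites = all_sites(length)

# Assumed "true" model: additive per-position log-affinities (toy data).
energies = rng.normal(size=(length, 4))
true_log = energies[np.arange(length), sites].sum(axis=1)
true_log -= true_log.max()            # highest-affinity site set to 0
true_prob = np.exp(true_log)
true_prob /= true_prob.sum()

# Indices of the top 1% of sites by true binding probability.
top = np.argsort(true_prob)[-max(1, len(sites) // 100):]

models = {
    "weighted, all sites": build_pm(sites, true_prob),
    "weighted, top 1%": build_pm(sites[top], true_prob[top]),
    "unweighted, top 1%": build_pm(sites[top], np.ones(len(top))),
}

# Pearson correlation between true and predicted log distributions
# over all sequences, one value per model.
results = {
    name: np.corrcoef(true_log, log_scores(pm, sites))[0, 1]
    for name, pm in models.items()
}
for name, r in results.items():
    print(f"{name}: r = {r:.3f}")
```

Because the toy energy model is additive, the PM built from all weighted sites recovers the true distribution almost exactly; the other two models see only a truncated sample, so their correlations depend on the random energies drawn. A non-additive true model would change all three values.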