An information theoretic treatment of sequence-to-expression modeling
Fig 4
(A) Modeling the SIM enhancer. Expression profiles of sim and of all TFs involved in sim regulation, shown for the ventral-most bins 1–25 of the 50 bins along the D/V axis. Su(H) is modeled as both an activator and a repressor. (B) Ensemble of models that accurately predict the SIM expression profile. (C) Information Gain from different perturbation experiments. Each experiment represents the readout of a variant of the wild-type enhancer under wild-type conditions and is named for the variant enhancer. (D) Entropy of the filtered ensemble for each of the nine experiments, as defined under the specially constructed probability distribution presented in this work (Y axis) or under a discrete uniform distribution (X axis). (E–G) The ‘Mesectoderm2.2’ variant of the sim enhancer [52] (S2 Table) has been observed to recapitulate the expression pattern of the wild-type enhancer (E). In contrast, the majority of the models in the wild-type ensemble (F) predict an expansion of the dorsal boundary, whereas an ensemble filtered by the ‘2.8simΔSD16’ experiment (G) predicts the known (non-)effect correctly.