Learning clinical networks from medical records based on information estimates in mixed-type data
Fig 2
Optimum bivariate discretization for mutual information estimate.
The proposed information-maximizing discretization scheme is illustrated for a joint distribution defined by a bivariate Gumbel copula with parameter θ = 5 and marginal distributions chosen as Gaussian mixtures with three equiprobable peaks, with respective means and variances μ_X = {0, 4, 6}, σ_X = {1, 2, 0.7} and μ_Y = {−3, 6, 9}, σ_Y = {2, 0.5, 0.5}. The information-maximizing partition yields (A) I_N(X; Y) = 1.04 for N = 500 samples and (B) I_N(X; Y) = 1.142 for N = 10,000 samples, compared with the exact expected value I(X; Y) = 1.205 computed by numerical integration. See S1 Fig for additional results. Code is available at https://github.com/vcabeli/miic_PLoS.
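For readers who want to experiment with a comparable setup, the following is a minimal sketch and not the authors' implementation (which is in the repository above). It rests on several assumptions: the Gumbel copula is sampled with the Marshall-Olkin construction using Kanter's representation of a positive stable variable; the σ values are treated as standard deviations; logarithms are natural; and a fixed 8 × 8 equal-frequency grid with a plug-in estimator stands in for the information-maximizing partition, which it does not attempt to optimize. The function names `sample_gumbel_copula`, `mixture_quantile`, and `plugin_mi` are hypothetical.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
N = 10_000
theta = 5.0  # Gumbel copula parameter quoted in the caption


def sample_gumbel_copula(n, theta, rng):
    """Draw n pairs (u, v) from a Gumbel copula (Marshall-Olkin construction)."""
    alpha = 1.0 / theta
    # Kanter's representation of a positive stable variable whose
    # Laplace transform is exp(-s**alpha).
    t = rng.uniform(0.0, math.pi, n)
    w = rng.exponential(1.0, n)
    stable = (np.sin(alpha * t) / np.sin(t) ** (1.0 / alpha)
              * (np.sin((1.0 - alpha) * t) / w) ** ((1.0 - alpha) / alpha))
    e = rng.exponential(1.0, (2, n))
    u, v = np.exp(-(e / stable) ** alpha)
    return u, v


def mixture_quantile(u, means, sds):
    """Map uniforms to an equal-weight Gaussian mixture by numerically inverting its CDF."""
    means, sds = np.asarray(means, float), np.asarray(sds, float)
    grid = np.linspace((means - 6 * sds).min(), (means + 6 * sds).max(), 4001)
    z = (grid[:, None] - means) / (sds * math.sqrt(2.0))
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z)).mean(axis=1)
    return np.interp(u, cdf, grid)


def plugin_mi(x, y, bins):
    """Plug-in mutual information (nats) on an equal-frequency bins x bins grid."""
    def to_bins(a):
        cuts = np.quantile(a, np.linspace(0.0, 1.0, bins + 1)[1:-1])
        return np.searchsorted(cuts, a)

    counts = np.zeros((bins, bins))
    np.add.at(counts, (to_bins(x), to_bins(y)), 1.0)
    p = counts / counts.sum()
    indep = p.sum(axis=1, keepdims=True) * p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / indep[nz])))


u, v = sample_gumbel_copula(N, theta, rng)
# Assumption: the sigma values are used as standard deviations.
x = mixture_quantile(u, [0, 4, 6], [1, 2, 0.7])
y = mixture_quantile(v, [-3, 6, 9], [2, 0.5, 0.5])
mi_hat = plugin_mi(x, y, bins=8)
print(f"plug-in estimate, 8x8 equal-frequency bins: {mi_hat:.3f} nats")
```

Because the grid here is fixed rather than optimized, the printed value is not expected to reproduce the estimates quoted in the caption; it only gives a baseline against which an optimized partition can be compared.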