
Heterogeneous Network Edge Prediction: A Data Integration Approach to Prioritize Disease-Associated Genes

Fig 4

Feature selection identifies a parsimonious yet predictive model.

Ridge and lasso models were fit from the complete network. The resulting standardized coefficients (y-axis) assess the effect size of each feature (x-axis). Brackets indicate features from MSigDB-traversing metapaths (Gm{}mGaD). The ridge model disperses effects among features, whereas the lasso concentrates them. The lasso identifies an 8-feature model with minimal performance loss compared to the ridge model. Besides KEGG, gene-set-based features were largely captured by Perturbations. The lasso retains several measures of pleiotropy as well as the one-step interactome feature (GiGaD).
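The contrast between the two penalties can be illustrated with a small toy example. The sketch below is not the authors' pipeline: it assumes a linear least-squares loss, synthetic data, hypothetical penalty strengths (`l1=0.3`, `l2=1.0`), and a hand-written coordinate-descent solver (`fit_penalized`, a name introduced here for illustration). Three of the six synthetic features are near-duplicates of one latent signal, one tracks a second signal, and two are pure noise.

```python
import random

def standardize(col):
    """Center a feature column and scale it to unit variance."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd for v in col]

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_penalized(cols, y, l1=0.0, l2=0.0, sweeps=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + l1*||b||_1 + (l2/2)*||b||_2^2.

    Assumes every column is standardized and y is centered.
    l1 > 0 with l2 = 0 gives the lasso; l2 > 0 with l1 = 0 gives ridge.
    """
    n, p = len(y), len(cols)
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, kept up to date
    for _ in range(sweeps):
        for j in range(p):
            col = cols[j]
            # Covariance of feature j with the residual, plus its own effect
            rho = sum(col[i] * resid[i] for i in range(n)) / n + beta[j]
            new = soft_threshold(rho, l1) / (1.0 + l2)
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * col[i]
                beta[j] = new
    return beta

# Synthetic data (an assumption for illustration, unrelated to the study's data)
rng = random.Random(0)
n = 300
z1 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [rng.gauss(0, 1) for _ in range(n)]
raw = [
    [z + 0.3 * rng.gauss(0, 1) for z in z1],  # f0: noisy copy of z1
    [z + 0.3 * rng.gauss(0, 1) for z in z1],  # f1: noisy copy of z1
    [z + 0.3 * rng.gauss(0, 1) for z in z1],  # f2: noisy copy of z1
    [z + 0.3 * rng.gauss(0, 1) for z in z2],  # f3: noisy copy of z2
    [rng.gauss(0, 1) for _ in range(n)],      # f4: pure noise
    [rng.gauss(0, 1) for _ in range(n)],      # f5: pure noise
]
y = [2 * a + b + 0.5 * rng.gauss(0, 1) for a, b in zip(z1, z2)]
y_mean = sum(y) / n
y = [v - y_mean for v in y]
cols = [standardize(c) for c in raw]

ridge = fit_penalized(cols, y, l2=1.0)
lasso = fit_penalized(cols, y, l1=0.3)

def n_nonzero(beta):
    """Count coefficients that are not exactly zero."""
    return sum(1 for b in beta if b != 0.0)

for name, beta in (("ridge", ridge), ("lasso", lasso)):
    print(name, [round(b, 3) for b in beta], "nonzero:", n_nonzero(beta))
```

In this toy setting the ridge penalty only rescales coefficients, so correlated features tend to share the effect, while the soft-thresholding step of the lasso can set coefficients to exactly zero. That is the mechanism behind the dispersed versus concentrated pattern described in the caption.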
