Machine learning on multiple epigenetic features reveals H3K27Ac as a driver of gene expression prediction across patients with glioblastoma
Fig 3
Cross-patient prediction methodology using the XGBoost model architecture.
The model inputs for training and validation are derived from one patient (GSC1), while the testing dataset comes from a different patient. As shown, the matrices are flattened before being passed to the model, which predicts the RNA-seq value. A) A functional view of the cross-patient experimental setup: model training is illustrated on the left side of the image, and the transition to testing with the trained model on the right. B) A conceptual view of the cross-patient experimental setup, illustrating the dataset allocation and the number of observations per feature in the training and testing data.
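The data allocation described in this caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`flatten_features`, `cross_patient_split`), the test-patient label `GSC2`, the array shapes, the validation fraction, and the synthetic values are all hypothetical. A trivial mean-predicting regressor stands in for XGBoost so that the example depends only on NumPy.

```python
import numpy as np


def flatten_features(matrices):
    """Reshape (n_genes, n_features, n_bins) into (n_genes, n_features * n_bins)."""
    matrices = np.asarray(matrices, dtype=float)
    return matrices.reshape(matrices.shape[0], -1)


def cross_patient_split(features, targets, train_patient, test_patient,
                        val_fraction=0.2, seed=0):
    """Training/validation rows come from one patient, test rows from another."""
    if train_patient == test_patient:
        raise ValueError("training and testing patients must differ")
    x_fit = flatten_features(features[train_patient])
    y_fit = np.asarray(targets[train_patient], dtype=float)
    # Shuffle the training patient's genes, then hold some out for validation.
    order = np.random.default_rng(seed).permutation(len(y_fit))
    n_val = int(len(y_fit) * val_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return {
        "train": (x_fit[train_idx], y_fit[train_idx]),
        "val": (x_fit[val_idx], y_fit[val_idx]),
        "test": (flatten_features(features[test_patient]),
                 np.asarray(targets[test_patient], dtype=float)),
    }


class MeanBaseline:
    """Stand-in regressor (NOT XGBoost): always predicts the training mean."""

    def fit(self, x, y):
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, x):
        return np.full(len(x), self.mean_)


# Synthetic data: the label "GSC2" and all shapes and values are made up.
rng = np.random.default_rng(42)
features = {"GSC1": rng.random((50, 4, 10)), "GSC2": rng.random((30, 4, 10))}
targets = {"GSC1": rng.random(50), "GSC2": rng.random(30)}

split = cross_patient_split(features, targets, "GSC1", "GSC2")
model = MeanBaseline().fit(*split["train"])
predictions = model.predict(split["test"][0])  # one predicted RNA-seq value per test gene
print(split["train"][0].shape, split["val"][0].shape, predictions.shape)
```

To move closer to the setup in the figure, `MeanBaseline` could be replaced with `xgboost.XGBRegressor`, which exposes the same `fit` and `predict` methods; the validation pair could then be passed to `fit` through its `eval_set` argument. Hyperparameters are not given in this caption and are left unspecified here.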