A metric for evaluating biological information in gene sets and its application to identify co-expressed gene clusters in PBMC
Fig 1
The GECO metric scoring process: from cluster assignment to GECO score.
Transcriptomic data were clustered using k-means clustering. Each cluster contained both ground truth genes (shown in blue) and non-ground truth genes (shown in grey). The scoring function was applied to each gene in each cluster. The gene score reflects the likelihood that a gene is a ground truth gene, based on the makeup of the cluster to which the gene belongs and the distribution of ground truth genes throughout the dataset. A table lists all the genes in the dataset, scored by cluster, together with their associated gene scores. The gene scores are used to generate a ROC plot, and the corresponding AUC value is the GECO metric, which indicates the overall quality of the clusters.
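The flow from cluster assignment to a single AUC value can be illustrated with a minimal sketch. The caption does not give the actual GECO scoring function, so the score used here (the fraction of ground truth genes in a gene's cluster divided by the fraction in the whole dataset) is only an assumed stand-in, and the names `gene_scores` and `roc_auc` are hypothetical. Cluster assignments are taken as given rather than computed with k-means.

```python
from collections import defaultdict


def gene_scores(clusters, ground_truth):
    """Score each gene from the makeup of its cluster.

    clusters: dict mapping gene name -> cluster id.
    ground_truth: set of gene names treated as ground truth.
    ASSUMPTION: the score is an enrichment ratio, not the published
    GECO scoring function, which this caption does not specify.
    """
    members = defaultdict(list)
    for gene, cluster_id in clusters.items():
        members[cluster_id].append(gene)

    # Fraction of ground truth genes across the whole dataset.
    n_truth = sum(gene in ground_truth for gene in clusters)
    if n_truth == 0:
        raise ValueError("no ground truth genes in the dataset")
    dataset_fraction = n_truth / len(clusters)

    scores = {}
    for genes in members.values():
        cluster_fraction = sum(g in ground_truth for g in genes) / len(genes)
        for g in genes:
            scores[g] = cluster_fraction / dataset_fraction
    return scores


def roc_auc(scores, ground_truth):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen ground truth gene
    scores higher than a randomly chosen non-ground truth gene,
    with ties counted as one half.
    """
    pos = [s for g, s in scores.items() if g in ground_truth]
    neg = [s for g, s in scores.items() if g not in ground_truth]
    if not pos or not neg:
        raise ValueError("need both ground truth and non-ground truth genes")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: two clusters that separate ground truth genes perfectly.
toy_clusters = {"geneA": 0, "geneB": 0, "geneC": 1, "geneD": 1}
toy_truth = {"geneA", "geneB"}
toy_auc = roc_auc(gene_scores(toy_clusters, toy_truth), toy_truth)
```

In this toy case the clusters separate the ground truth genes completely, so the AUC is 1.0. If every gene fell into a single cluster, all scores would tie and the AUC would be 0.5, which corresponds to clusters carrying no information about the ground truth set.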