Fig 1.
Degree distribution of Chem2Bio2Rdf.
Fig 2.
Developed from Node2vec [27].
Table 1.
Statistics of the Chem2Bio2Rdf dataset.
Fig 3.
Correlation between graph embedding and linking.
Fig 4.
The framework of the knowledge embedding cascade model.
Fig 5.
Path prediction for a specific head and tail.
Table 2.
Link prediction results (accuracy).
Fig 6.
ROC curves for a subset of relations on the link prediction task.
Table 3.
Link prediction based on settings of different graph embedding algorithms (accuracy).
Table 4.
Link prediction based on settings of different knowledge embedding algorithms (accuracy).
Fig 7.
Sensitivity of the proposed cascade model to the parameters of Node2vec.
Table 5.
Entity prediction results.
Table 6.
Entity prediction based on settings of different graph embedding algorithms (hits@10).
Table 7.
Entity prediction based on settings of different knowledge embedding algorithms (hits@10).
Table 8.
Hits@10 rate for each relation on the biochemical data set.
n stands for the average number of head entities (respectively, tail entities) in the data set given a pair (r, t) (respectively, (h, r)).
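One way to read the definition of n is sketched below: for each relation, group the triplets by (r, t) and by (h, r), count the distinct heads or tails in each group, and average the counts. The function name `average_entities_per_pair` and the toy triplets are hypothetical, and counting distinct entities is an assumption rather than a detail given in the caption.

```python
from collections import defaultdict


def average_entities_per_pair(triplets):
    """Return {relation: (avg heads per (r, t), avg tails per (h, r))}.

    Assumes each triplet is (head, relation, tail) and that entities
    are counted once per pair (distinct heads / distinct tails).
    """
    heads = defaultdict(set)  # (r, t) -> distinct head entities
    tails = defaultdict(set)  # (h, r) -> distinct tail entities
    for h, r, t in triplets:
        heads[(r, t)].add(h)
        tails[(h, r)].add(t)

    result = {}
    for rel in {r for _, r, _ in triplets}:
        head_counts = [len(v) for (r, _), v in heads.items() if r == rel]
        tail_counts = [len(v) for (_, r), v in tails.items() if r == rel]
        result[rel] = (
            sum(head_counts) / len(head_counts),
            sum(tail_counts) / len(tail_counts),
        )
    return result


# Hypothetical toy data: three drugs binding two genes.
toy_triplets = [
    ("d1", "bind", "g1"),
    ("d2", "bind", "g1"),
    ("d3", "bind", "g1"),
    ("d1", "bind", "g2"),
]
n_values = average_entities_per_pair(toy_triplets)
```

In this toy case the pair (bind, g1) has three heads and (bind, g2) has one, so the head-side average is 2, while the tail-side average is 4/3.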
Table 9.
Top 30 drug-disease-gene paths.
The relations treat, caused by, and bind link drug-disease, disease-gene, and drug-gene pairs, respectively. The value x/y indicates whether or not the relation exists in a data set: x = 1 indicates the presence of a relation in the training or test sets; y = 1 indicates the presence of a relation in databases such as DisGeNET and DrugBank.
Table 10.
Matching results of the top 100 drug-disease-gene paths against the databases.
“Number of triplets” is the number of triplets of a specific relation involved in the top 100 paths. “Predictions” is the number of relations found neither in the data sets nor in the chosen databases. “Proven predictions” is the number of relations not in the data sets but matched with the chosen databases.
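The three counts defined above can be illustrated with plain set operations. This is a minimal sketch, assuming triplets are deduplicated before counting; the function name `classify_relations` and the drug and disease identifiers are hypothetical and not taken from the paper.

```python
def classify_relations(path_triplets, dataset_triplets, database_triplets):
    """Count triplets, predictions, and proven predictions for one relation.

    path_triplets: triplets of the relation that appear in the top paths.
    dataset_triplets: triplets present in the training or test sets.
    database_triplets: triplets present in the chosen external databases.
    """
    unique = set(path_triplets)                 # assumption: duplicates count once
    unseen = unique - set(dataset_triplets)     # not in the data sets
    proven = unseen & set(database_triplets)    # unseen, but matched in a database
    predictions = unseen - set(database_triplets)  # in neither source
    return {
        "number_of_triplets": len(unique),
        "predictions": len(predictions),
        "proven_predictions": len(proven),
    }


# Hypothetical example for the relation "treat".
paths = [
    ("drug_a", "treat", "disease_x"),
    ("drug_b", "treat", "disease_x"),
    ("drug_c", "treat", "disease_y"),
    ("drug_a", "treat", "disease_x"),  # repeated across paths
]
dataset = {("drug_a", "treat", "disease_x")}
database = {("drug_a", "treat", "disease_x"), ("drug_b", "treat", "disease_x")}

counts = classify_relations(paths, dataset, database)
```

Here `drug_b` treating `disease_x` is absent from the data sets but present in a database, so it counts as a proven prediction, while `drug_c` treating `disease_y` appears in neither and counts as a prediction.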