Predicting biomedical relationships using the knowledge and graph embedding cascade model

doi:10.1371/journal.pone.0218264

Fig 1.

Degree distribution of Chem2Bio2Rdf.

Fig 2.

The framework of Node2vec.

Developed from Node2vec [27].

Table 1.

Statistics of the Chem2Bio2Rdf dataset.

Fig 3.

Correlation between graph embedding and linking.

Fig 4.

The framework of the knowledge embedding cascade model.

Fig 5.

Path prediction for a specific head and tail.

Table 2.

Link prediction results (accuracy).

Fig 6.

ROC curves for selected relations on the link prediction task.

Table 3.

Link prediction under different graph embedding algorithms (accuracy).

Table 4.

Link prediction under different knowledge embedding algorithms (accuracy).

Fig 7.

Sensitivity of the proposed cascade model to the parameters of Node2vec.

Table 5.

Entity prediction results.

Table 6.

Entity prediction based on settings of different graph embedding algorithms (hits@10).

Table 7.

Entity prediction based on settings of different knowledge embedding algorithms (hits@10).

Table 8.

Hits@10 rate for each relation on the biochemical data set.

n is the average number of head entities (respectively, tail entities) in the data set for a given pair (r, t) (respectively, (h, r)).

Table 9.

Top 30 drug-disease-gene paths.

The relations treat, caused by, and bind are associated with drug-disease, disease-gene, and drug-gene pairs, respectively. The value x/y indicates whether the relation exists in a data set: x = 1 indicates that the relation is present in the training or test sets; y = 1 indicates that it is present in databases such as DisGeNET and DrugBank.

Table 10.

Matching results of the top 100 drug-disease-gene paths against the databases.

“Number of triplets” is the number of triplets of the given relation involved in the top 100 paths. “Predictions” is the number of relations found neither in the data sets nor in the chosen databases. “Proven predictions” is the number of relations not in the data sets but matched in the chosen databases.
