Fig 1.
The relationship between the number of genes and diseases.
The histogram illustrates the number of genes and their related diseases.
Fig 2.
Schematic overview of identifying the disease-gene associations.
Neighboring proteins of a query protein are sought using the inversion score from the STRING database, and their related diseases from DisGeNET are mapped. The disease relationship between two proteins is calculated using an association index. Then, the association scores of probable disease-gene pairs are calculated. Promising candidates are identified using gene and disease selections.
Table 1.
Summary of standard association indices for calculating disease relationships.
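As an illustration of how such indices can be computed, the following minimal sketch assumes that the disease relationship between two proteins is measured on their sets of associated diseases, and that the Jaccard, Simpson, Geometric, and Cosine indices take their commonly used set-based forms. The function name `association_indices` and the disease labels are hypothetical and are not taken from Table 1.

```python
from math import sqrt


def association_indices(diseases_a, diseases_b):
    """Compute four set-based association indices between two disease sets.

    Assumed (commonly used) definitions, with A and B the two sets:
      Jaccard   = |A & B| / |A | B|
      Simpson   = |A & B| / min(|A|, |B|)
      Geometric = |A & B|^2 / (|A| * |B|)
      Cosine    = |A & B| / sqrt(|A| * |B|)
    """
    a, b = set(diseases_a), set(diseases_b)
    # An empty set shares nothing, so every index is defined as 0 here.
    if not a or not b:
        return {"jaccard": 0.0, "simpson": 0.0, "geometric": 0.0, "cosine": 0.0}
    shared = len(a & b)
    return {
        "jaccard": shared / len(a | b),
        "simpson": shared / min(len(a), len(b)),
        "geometric": shared ** 2 / (len(a) * len(b)),
        "cosine": shared / sqrt(len(a) * len(b)),
    }


# Hypothetical disease sets for two neighboring proteins.
indices = association_indices({"d1", "d2", "d3"}, {"d2", "d3", "d4", "d5"})
```

With these hypothetical sets, two diseases are shared out of five distinct ones, so the Jaccard index is 0.4, while the Simpson index normalizes by the smaller set and gives 2/3.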
Fig 3.
Receiver operating characteristic curves for the predictions of disease-gene association.
The disease-gene prediction results at the optimal parameter k = 18.
Fig 4.
The highest F-measure obtained in cross-validation of the DGA algorithm with disease relationships calculated using the Jaccard, Simpson, Geometric, and Cosine indices.
Fig 5.
Comparison of the performance of our method and random experiments.
The disease-gene association predictions from the DGA algorithm, with disease relationships calculated using the Jaccard index, were compared with those from random experiments at different values of k.
Fig 6.
Receiver operating characteristic curves for the prediction of disease-gene associations using interactions from protein complexes.
The disease-gene prediction results using our DGA algorithm with interactions from protein complexes obtained from the CORUM database.
Table 2.
List of genes that met the selection criteria: a gene coverage value greater than or equal to 70 and an association score greater than or equal to 40.
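The selection criteria in this caption amount to a two-threshold filter, which can be sketched as follows. The record layout (the keys `gene`, `coverage`, and `score`), the function name `select_genes`, and the example rows are hypothetical placeholders, not data from Table 2; only the two thresholds come from the caption.

```python
def select_genes(records, min_coverage=70, min_score=40):
    """Keep records whose coverage and association score meet both thresholds.

    `records` is assumed to be an iterable of dicts with the hypothetical
    keys "gene", "coverage", and "score".
    """
    return [
        r for r in records
        if r["coverage"] >= min_coverage and r["score"] >= min_score
    ]


# Placeholder rows for illustration only (not results from the paper).
example = [
    {"gene": "GENE_A", "coverage": 85, "score": 55},
    {"gene": "GENE_B", "coverage": 70, "score": 40},  # exactly on both thresholds
    {"gene": "GENE_C", "coverage": 90, "score": 39},  # score too low
    {"gene": "GENE_D", "coverage": 69, "score": 80},  # coverage too low
]
selected = [r["gene"] for r in select_genes(example)]
```

Both comparisons are inclusive, matching the "greater than or equal to" wording, so a gene that sits exactly on both thresholds is retained.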
Table 3.
List of predicted disease and gene pairs that were not evident in the gold standard.