Figure 1.
Process flow for identifying therapeutic targets for oral cancer.
Table 1.
Dataset Details.
Figure 2.
Literature Mining Process Flow.
Table 2.
A 2×2 contingency table built from search statistics for BIRC5.
Figure 3.
Data Attributes Before and After Batch-Correction.
Samples are depicted as colored dots in the PCA plots: “red” and “green” dots represent cancer and control samples, respectively, from Ambatipudi et al., 2012, whereas “blue” and “cyan” dots represent cancer and control samples, respectively, from Peng et al., 2011. Plots (a) and (b) are the PCA and power distribution plots for the dataset before batch correction. Plots (c) and (d) are the PCA and power distribution plots for the dataset after batch correction by ComBat. Plots (e) and (f) are the PCA and power distribution plots for the dataset after batch correction by XPN.
Figure 4.
Significantly overexpressed genes are represented as ‘red’ dots and significantly underexpressed genes as ‘green’ dots in the volcano plot. The names of some of the most strongly under- and overexpressed genes appear on the left and right sides of the volcano plot, respectively.
Figure 5.
The Consolidated Causal Network.
Genes are depicted as nodes of the causal network. Hypothesis genes are colored ‘red’ or ‘blue’ to indicate their over- or under-expression, respectively, as observed in the study dataset. Relationships are depicted as edges (arrows) in the causal network. A solid arrow represents an ‘activation’ relationship between the connected nodes, whereas a dashed arrow represents an ‘inhibition’ relationship. A node that has been identified as a hypothesis gene and is also a downstream gene of another hypothesis is marked with an extra peripheral ring.
Figure 6.
Literature Mining Result Statistics.
Figure 7.
List of potential therapeutic targets for oral cancer.
The check mark ‘✓’ indicates significant publication evidence supporting an association between the target gene and the cancer hallmark named in the corresponding column, and ‘’ indicates the absence of such an association. The ‘’ sign represents significant overexpression of the gene, and ‘’ represents significant under-expression of the gene, as observed in oral cancer in the study dataset. ‘CausalNet Degree’ is the number of genes causally connected to the particular target gene. ‘Diff’ is the difference in the number of connections in the dependency network between the cancer and control conditions for the target gene. ‘MN’ means that annotations for the target gene were inferred from articles related to mouth neoplasms or oral cancer, whereas ‘C’ means that the annotations are not specific to oral cancer and were inferred using the generic term ‘neoplasms’ or cancer.