Figure 1.
Description of the DGD database development process, from sequence similarity analyses and integration of gene annotation data from NCBI, Ensembl and HGNC websites to the integration and computation of functional data from GEO (Gene Expression Omnibus) and GOA (Gene Ontology Annotation).
Figure 2.
Distribution of the number of groups of duplicated genes according to the number of duplicated genes per group.
BTA: Bos taurus; CAF: Canis familiaris; DER: Danio rerio; ECA: Equus caballus; GGA: Gallus gallus; HSA: Homo sapiens; MMU: Mus musculus; RNO: Rattus norvegicus; SSC: Sus scrofa.
Table 1.
Statistics on DGD content.
Table 2.
Statistics for the groups of duplicated genes.
Figure 3.
Proportion of significant correlations.
Boxplots of significant correlations of expression for duplicated genes (blue), non-duplicated genes (orange) and randomly selected genes (yellow). (A) Correlations for all groups of genes. Means with a different letter are significantly different according to Student’s R t-tests at p<0.05 (n = 3320, 2760 and 13605, respectively). (B) Correlations according to the number of genes within groups. For every group size, the means of the three types of group are significantly different from one another (p<0.05).
Figure 4.
Distribution of semantic similarities.
(A) Distribution of GO biological process semantic similarities in duplicated gene groups (blue) vs. randomly selected gene groups (yellow). Means with a different letter are significantly different according to Student’s R t-tests at p<0.05. (B) Details of the same distribution with groups pooled by size. For each group size, the mean of the duplicated gene groups is significantly different from the mean of the randomly selected gene groups (p<0.05). Note: no data were available for the group with 11 genes.