Figure 1.

Four-step pipeline to assign gene origin, conservation, and duplicability.

(A) All unique genes of the four species are assigned to clusters of orthologs with different inclusiveness. (B) All 373 species present in EggNOG [19] are associated with seven internal nodes of the tree of life. (C) Orthologs and paralogs of each gene are identified in the seven internal nodes. (D, E, F) These pieces of information are combined to identify origin, conservation, and duplicability of each gene. LUCA, last universal common ancestor.
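The final step of the pipeline can be sketched as follows. This is only an illustration: the lineage list, the function names, and the decision rules (origin is the most ancient node with a detected ortholog; a gene is duplicated if any paralog is detected) are assumptions, since the caption does not spell them out.

```python
# Hypothetical ordered lineage for H. sapiens, most ancient node first.
# The caption only states that there are seven internal nodes.
HUMAN_LINEAGE = ["LUCA", "eukaryotes", "opisthokonts", "metazoans",
                 "vertebrates", "mammals", "primates"]


def assign_origin(ortholog_nodes, lineage=HUMAN_LINEAGE):
    """Return the most ancient node in `lineage` where an ortholog is found."""
    for node in lineage:
        if node in ortholog_nodes:
            return node
    # Assumption: a gene with no detected ortholog is treated as group-specific.
    return lineage[-1]


def assign_duplicability(paralog_nodes):
    """Label a gene 'duplicated' if a paralog is found at any node."""
    return "duplicated" if paralog_nodes else "singleton"


# Example: orthologs first appear in metazoans, one paralog in vertebrates.
origin = assign_origin({"metazoans", "vertebrates", "mammals"})
label = assign_duplicability({"vertebrates"})
```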


Table 1.

Gene sets used in the analysis.


Figure 2.

Origin, conservation, and duplicability of genes in evolution.

(A) The percentage of genes that originated at each internal node of the tree of life is shown for the four species used in the analysis and for seven additional species. The group-specific nodes correspond to primates for H. sapiens, rodents for M. musculus, birds for Gallus gallus, fishes for D. rerio, nematodes for C. elegans, insects for D. melanogaster and A. mellifera, fungi for S. cerevisiae and Schizosaccharomyces pombe, and bacteria for E. coli and Bacillus subtilis. The lack of specific genes for C. elegans, G. gallus and M. musculus is likely an artifact of the small number of species present in EggNOG for the corresponding group-specific nodes. LUCA, last universal common ancestor; euk, eukaryotes; opi, opisthokonts; met, metazoans; ver, vertebrates. (B) The percentage of genes that have the same conservation is shown for each species. Conservation is measured as the number of internal nodes where no ortholog is found since the gene appeared, so lower values indicate more conserved genes. In all species, conservation ranges from 0 (i.e. no missing node) to 5 (i.e. the gene originated with LUCA and has orthologs only in prokaryotes and in the group-specific cluster). Since only a few genes have conservation 5, we grouped them with genes with conservation 4. (C) The percentage of singleton and duplicated genes is shown for all eleven species.
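The conservation measure in panel B can be illustrated with a short sketch. The node list and function name are hypothetical, and the sketch assumes that a gene's presence at each internal node is already known; it simply counts the nodes after the origin where no ortholog is found and merges the value 5 into 4, as described above.

```python
# Hypothetical ordered lineage for H. sapiens, most ancient node first.
LINEAGE = ["LUCA", "eukaryotes", "opisthokonts", "metazoans",
           "vertebrates", "mammals", "primates"]


def conservation(origin, ortholog_nodes, lineage=LINEAGE, cap=4):
    """Count internal nodes after `origin` that lack an ortholog.

    0 means no missing node. Values above `cap` are merged into `cap`;
    pass cap=None to get the raw count.
    """
    start = lineage.index(origin)
    missing = sum(1 for node in lineage[start + 1:]
                  if node not in ortholog_nodes)
    return missing if cap is None else min(missing, cap)


# A gene that originated with LUCA and is otherwise found only in the
# group-specific node misses five nodes, reported as 4 after grouping.
raw = conservation("LUCA", {"LUCA", "primates"}, cap=None)
grouped = conservation("LUCA", {"LUCA", "primates"})
```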


Table 2.

Protein interaction networks.


Figure 3.

Relationship between gene origin and conservation and network properties.

Degree (connectivity) and betweenness (centrality) are compared between (A) proteins that originated at a given node and younger or older proteins; and (B) proteins with a given conservation and less or more conserved proteins. In both analyses, the differences are assessed with the Wilcoxon test and the resulting p-values are transformed into heatmaps. Each square represents genes that originated at a given internal node or with a given level of conservation. The color represents the p-value. Red is associated with more connected or more central proteins, green is associated with less connected or less central proteins. The lower bound of p-values is set equal to 10⁻³.
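One way to turn such a comparison into a single heatmap cell is sketched below. It assumes the test is a two-sided Wilcoxon rank-sum test between two independent groups, uses a normal approximation without continuity correction, and encodes direction as the sign of a floored -log10(p). The function names and the signed-score encoding are illustrative assumptions, not the procedure used to draw the figure.

```python
import math
from statistics import NormalDist

P_FLOOR = 1e-3  # lower bound on p-values stated in the caption


def midranks(values):
    """Rank values from 1 to N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def rank_sum_test(x, y):
    """Two-sided rank-sum test (normal approximation). Returns (z, p)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    n = n1 + n2
    ranks, ties = midranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    tie_term = sum(t ** 3 - t for t in ties) / (n * (n - 1))
    variance = n1 * n2 / 12 * ((n + 1) - tie_term)
    if variance == 0:
        return 0.0, 1.0  # all values tied: no evidence of a difference
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


def heatmap_score(x, y):
    """Signed -log10(p): positive if x tends to be larger than y."""
    z, p = rank_sum_test(x, y)
    return math.copysign(-math.log10(max(p, P_FLOOR)), z)
```

With this encoding, a positive score would map to red (group `x` more connected or central) and a negative score to green, and the floor keeps every cell within the range -3 to 3.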


Figure 4.

Properties of ancient and recent hubs.

(A) Degree (connectivity) and betweenness (centrality) of proteins encoded by duplicated and singleton genes of the same age are compared using the Wilcoxon test and the obtained p-values are transformed into heatmaps. Each square represents genes that originated at a given internal node and the color represents the p-value. Red indicates that duplicated genes encode significantly more connected or more central proteins than singleton genes; green indicates that proteins encoded by singleton genes are significantly more connected or more central than those encoded by duplicated genes. The lower bound of p-values is set equal to 10⁻³. (B) Functional differences are analyzed between (1) ancestral and recent human hubs; (2) all ancestral and all recent human genes; (3) all singletons and all duplicated human genes. For each comparison, significance is assessed with Fisher's exact test and the p-values are adjusted for the False Discovery Rate (FDR). Vertical bars correspond to individual GO terms that are further grouped into 12 functional categories. Blue bars represent enrichment in duplicated genes, recent genes, or recent hubs; orange bars represent enrichment in singletons, ancient genes, or ancient hubs.
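The enrichment test in panel B can be sketched with the standard library alone. The caption names Fisher's exact test and FDR adjustment but not the adjustment method; the Benjamini-Hochberg procedure used here is an assumption, as are the function names and the 2x2 table layout (genes with or without a GO term, in group 1 or group 2).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(x):
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    p_obs = prob(a)
    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    # Small relative tolerance so that equally likely tables are included.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(total, 1.0)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, one table would be built per GO term, the raw p-values collected into a list, and the adjusted values compared against the chosen FDR threshold.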


Figure 5.

Dosage regulations of human hubs.

(A) The fraction of human duplicated hubs that are ohnologs, miRNA targets, and tissue-selective genes is compared to the corresponding fraction of singleton hubs. Although the main contribution is due to ohnologs (B), the enrichment remains detectable when miRNA targets (C) and tissue-selective genes (D) are considered separately. Small-scale duplications refer to duplicated hubs that are not the result of whole-genome duplication (i.e. they are not within the dataset of ohnologs). Since the numbers of hubs that originated with opisthokonts and primates are only 43 and 17, respectively, we group them with hubs that originated with eukaryotes and mammals, respectively. * significant enrichment when compared to older genes (Fisher's exact test).


Figure 6.

Dosage regulation of the atrophin genes.

Atrophins are metazoan-specific genes that underwent duplication in vertebrates. The fly ortholog Atro is highly dosage sensitive: increased and reduced expression due to modifications of miR-8 lead to neurodegenerative and survival defects [38], [39]. Rere, one of the two vertebrate atrophin paralogs, is a target of miR-200b and miR-429, the vertebrate counterparts of miR-8. Dosage modifications of Rere lead to re-localization of the other paralog, ATN1, in the nucleus, upon direct binding [41]. Interestingly, ATN1 is the gene responsible for dentatorubral-pallidoluysian atrophy (DRPLA) [40].
