Hominoid-Specific De Novo Protein-Coding Genes Originating from Long Non-Coding RNAs
(A) Hominoid-specific de novo protein-coding genes were identified on the basis of the gene locus and ORF age assignments. Regions within dotted red lines indicate the steps repeated for each out-group species. We further filtered this list using stringent inclusion criteria, yielding a smaller, high-confidence list of 24 de novo genes. (B) Distribution of protein length for the 24 de novo genes, compared with the human genome as background. (C) Distribution of summed RPKM scores of the 24 de novo genes in seven human tissues, compared with the human genome as background. (D) Pie chart showing the distribution of the 24 de novo protein-coding genes in terms of the reuse of preexisting transcriptional context. The number of genes in each category is marked. None: no evidence for the reuse of transcriptional context; bi: located downstream of a bi-directional promoter; +: overlapping with preexisting genes on the same strand; −: overlapping with preexisting genes on the opposite strand. (E) Venn diagrams showing the contribution of Alu sequences to exons and splicing junctions in de novo protein-coding genes.
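As a reading aid for panel (C), the summed RPKM score of a gene can be sketched as the sum of its per-tissue RPKM values, using the standard definition RPKM = 10^9 × (reads mapped to the gene) / (gene length in bp × total mapped reads). This is a minimal sketch and not necessarily the exact procedure behind the figure: the function names, the tissue labels, and the read counts below are hypothetical and chosen only for illustration.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def summed_rpkm(reads_by_tissue, gene_length_bp, total_reads_by_tissue):
    """Sum a gene's RPKM over every tissue present in reads_by_tissue."""
    return sum(
        rpkm(reads, gene_length_bp, total_reads_by_tissue[tissue])
        for tissue, reads in reads_by_tissue.items()
    )


# Hypothetical counts for one gene in two tissues (the figure uses seven).
example_reads = {"tissue_1": 100, "tissue_2": 50}
example_totals = {"tissue_1": 1_000_000, "tissue_2": 2_000_000}
example_score = summed_rpkm(example_reads, 1000, example_totals)
```

Summing over tissues gives a single expression value per gene, so a gene expressed in only one tissue still contributes to the distribution.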