Hominoid-Specific De Novo Protein-Coding Genes Originating from Long Non-Coding RNAs
(A) An example of a de novo gene, ENST00000315302, that partially overlaps a pre-existing gene, ODZ3, transcribed from the opposite strand of the DNA. The ortholog of ENST00000315302 in rhesus macaque was aligned according to the UCSC genome-wide multiple alignments. Junction reads generated by strand-specific RNA-Seq assays are shown as bold black lines, with the fragments of each read that cross a splicing junction connected by thinner lines. The mapped reads strongly support transcription of the de novo gene from the minus strand, as most reads appear in the track for ‘reads transcribed from the minus-strand’. The regions around all four splicing junctions are highlighted in dotted boxes and expanded in (B): three junctions in ENST00000315302, transcribed from the minus strand, and one junction on the plus strand. All of these splicing junctions are well supported by RNA-Seq reads mapped to the corresponding strand of the DNA. Vertical dotted lines in brown or blue mark the exon boundaries of transcripts on the minus or plus strand, respectively. (C) An example of a candidate de novo gene discarded during manual curation because the RNA-Seq data in rhesus macaque were not consistent with the splicing pattern predicted from the human gene models. The common disabler is marked with a red star; the junction reads indicate that it is in fact spliced out in rhesus macaque. Scale bars are shown as a reference for gene size.