A Human-Specific De Novo Protein-Coding Gene Associated with Human Brain Functions

Figure 1

Gene structure of FLJ33706, a human-specific de novo protein-coding gene.

Data for the tracks ‘Spliced Human EST’ and ‘Human mRNA’ were extracted and assembled from the UCSC Genome Browser. We re-sequenced all available mRNAs and spliced ESTs; the results are shown in the track ‘Re-sequenced ESTs/mRNAs’. On the basis of these data, we inferred the structure of this novel gene, with its six exons marked ‘1∼6’ in the track ‘FLJ33706 Gene Structure’. Exons partially derived from re-sequenced data are highlighted in green. An ORF spanning two short coding exons, located in exon 3 and exon 4, was identified and encodes a 194-amino-acid peptide (track ‘Open Reading Frame (ORF)’). Newly inserted transposable elements, especially Alu sequences, contributed substantially to the formation of the first coding exon and of six standard splicing junctions on the branch leading to human and chimpanzee; these junctions are marked ‘a∼f’ in the track ‘FLJ33706 Gene Structure’. All repeat elements in this region are shown in the track ‘RepeatMasker’, extracted from the UCSC Genome Browser. In the tracks ‘Spliced Human EST’, ‘Human mRNA’ and ‘FLJ33706 Gene Structure’, coding exons are represented by taller vertical bars, while UTRs and intronic regions are represented by shorter vertical bars. Scale bars are included to indicate gene sizes. Tracks drawn at different scales are separated by horizontal lines.

