Figure 1.
Evolutionary Scenario S1, Describing the Case where the Pseudogene Originated before the Species Split and Has Acquired as well as Maintained Function
G and ψ on tree branches refer to gene and pseudogene evolution, respectively.
Figure 2.
Evolutionary Scenario S2, Describing the Common Case of Late and Independent Pseudogene Origin
Figure 3.
Evolutionary Scenario S3, Describing Independent Transitions
Figure 4.
Visualization of the Effect of Mutual-Best-Hit Filtering
The tree shows the evolutionary history of a sequence set associated with the ATXN7L3 orthologous proteins. We found two potentially pseudogenic sequences in each species, which gives a total of four quartets to investigate: the gene-sequence pair together with each human–mouse combination of the pseudogenes. The human chrX pseudogene is unlikely to be closely related to either of the mouse pseudogenes, so any quartet that includes the sequence from the X chromosome should be of limited interest. If we pair each human pseudogene only with its most similar mouse pseudogene (and vice versa), the sole remaining example is the human chr12–mouse chr10 pair.
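The filtering step described in this caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `mutual_best_hits`, the sequence labels, and the similarity scores are all hypothetical, and the only assumption is that some pairwise similarity score (higher meaning more similar) is available for every human–mouse pseudogene pair.

```python
def mutual_best_hits(similarity):
    """Return the human-mouse pairs that are each other's best hit.

    `similarity` maps (human_id, mouse_id) -> score, higher = more similar.
    The scoring scheme itself is an assumption; any pairwise measure works.
    """
    best_for_human = {}  # human_id -> most similar mouse_id seen so far
    best_for_mouse = {}  # mouse_id -> most similar human_id seen so far
    for (human, mouse), score in similarity.items():
        current = best_for_human.get(human)
        if current is None or score > similarity[(human, current)]:
            best_for_human[human] = mouse
        current = best_for_mouse.get(mouse)
        if current is None or score > similarity[(current, mouse)]:
            best_for_mouse[mouse] = human
    # Keep a pair only if the preference is reciprocal.
    return sorted(
        (human, mouse)
        for human, mouse in best_for_human.items()
        if best_for_mouse[mouse] == human
    )


# Hypothetical scores for two human and two mouse pseudogenes (four quartets).
example_scores = {
    ("human_chr12", "mouse_chr10"): 0.95,
    ("human_chr12", "mouse_other"): 0.60,
    ("human_chrX", "mouse_chr10"): 0.55,
    ("human_chrX", "mouse_other"): 0.50,
}
surviving_pairs = mutual_best_hits(example_scores)
print(surviving_pairs)
```

With these made-up scores, only the chr12–chr10 pair is reciprocal, so three of the four candidate quartets are discarded, which mirrors the situation in the figure.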
Figure 5.
Histogram of Likelihood Quotients when Comparing Scenarios S1 and S2
Figure 6.
Histogram of Likelihood Quotients when Comparing Scenarios S1 and S3
Table 1.
Number of Human–Mouse Sequence Pairs prior to and following Mutual-Best-Hit Filtering
Table 2.
Number of Sequence Pairs in Each Class Favoring a Particular Scenario
Table 3.
p-Values for Scenario Comparisons and Pseudogene Expression Evidence (Number of Matching EST and mRNA Sequences) for the 20 Syntenic S1 Quartets
Figure 7.
S3 Topology with Gene-to-Pseudogene Breakpoints
[…] refers to the length of the branch on which the human pseudogene has evolved genelike, and similarly for […], […], and […].
Table 4.
Human and Mouse Expression for 262 S3 Quartets Selected as Described in Table 1
Figure 8.
Conservation between Human and Mouse Gene and Pseudogene Sequences for the 20 Syntenic S1 Sequences
Blue stars indicate genes. Red circles indicate pseudogenes. The histogram shows, for reference, the conservation of all genes giving rise to pseudogenes. Compare with Table 5, which lists the same data.
Table 5.
Conservation Percentage in and around the Pseudogene
Figure 9.
Recognizing Pseudogenes by Inspecting Their Alignment
(A) An alignment, visualized with TeXshade [34], of the processed copies to the ATXN7L3 human and mouse protein-coding genes. The human as well as the mouse ATXN7L3 contains 12 exons, which are all present in the respective duplicates. Approximate exon borders are shown in yellow.
The most interesting part consists of columns 1–468 (boxed green), which according to several EST and mRNA sequences is the only segment expressed. It consists of a highly conserved part, 1–288 (red), which is a potential open reading frame, followed by part 289–468 with pseudogenic disablements.
(B) Selected parts of the alignment of the ATX1 copies, which are also processed. The protein-coding genes contain eight exons, of which only parts of the last two code for protein. The entire segment of the pseudogenes corresponding to the protein-coding parts of the genes is expressed. The possibility that the processed copies are protein-coding cannot be completely ruled out: each pseudogene consists of one single 2,068-bp-long open reading frame. However, the frame induced by the alignments to the protein-coding genes contains several pseudogenic disablements.
Table 6.
Percentage of Expressed Pseudogenes in Relation to Their Conservation p-Values (Calculated with Hoeffding's Bound)
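For readers unfamiliar with Hoeffding's bound, the sketch below shows one way a conservation p-value could be bounded with it. It is an illustration under stated assumptions, not the calculation used for this table: it assumes each of `n_sites` aligned positions is an independent 0/1 indicator of conservation, and that `expected_fraction` is the conservation expected under the null model. The function name and parameters are hypothetical. The one-sided Hoeffding inequality then bounds the probability of observing a conserved fraction at least as high as `observed_fraction` by exp(-2 n t^2), where t is the excess over the expectation.

```python
import math


def hoeffding_p_value(n_sites, observed_fraction, expected_fraction):
    """Upper bound on P(conserved fraction >= observed) via Hoeffding's inequality.

    Assumes independent sites, each scored 0 or 1 (an illustrative assumption).
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    excess = observed_fraction - expected_fraction
    if excess <= 0:
        # No excess conservation: the bound is uninformative.
        return 1.0
    return math.exp(-2.0 * n_sites * excess ** 2)


# Hypothetical example: 100 aligned sites, 90% conserved, 80% expected.
bound = hoeffding_p_value(100, 0.9, 0.8)
print(bound)
```

Because this is an upper bound rather than an exact tail probability, it is conservative: the true p-value under the stated assumptions can only be smaller.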
Table 7.
Human–Chimpanzee Conserved and Expressed Pseudogene Pairs
Figure 10.
Flow Diagram of the Pseudogene Assignment Process