Consequences of Lineage-Specific Gene Loss on Functional Evolution of Surviving Paralogs: ALDH1A and Retinoic Acid Signaling in Vertebrate Genomes
(A) Composite dotplot representing the distribution of paralogs of genes (black dots) located within a 10 Mb window surrounding each member of the ALDH1A family throughout the human genome (red crosses: ALDH1A1-neighbor paralogs; dark blue crosses: ALDH1A2-neighbor paralogs; light-green crosses: ALDH1A3-neighbor paralogs). The genomes of Ciona intestinalis and Branchiostoma floridae, which represent urochordates and cephalochordates, respectively, the two closest relatives of vertebrates, were used as outgroups to define paralogy groups in the human genome. Gene accession numbers and genomic information for each paralogy group represented in the dotplot are provided in Table S1. Human chromosomes are arranged along the y-axis and drawn to scale along the x-axis, with the p-terminus of each chromosome at the left and the q-terminus at the right of each white row. Chromosomal regions that appear enriched in ALDH1A-neighbor paralogs are indicated with colored boxes, highlighted in pink if ALDH1A genes are present and in yellow if no ALDH1A genes are present. The distribution of paralogs of genes located in the yellow boxes is also represented in the dotplot (golden crosses: Hsa1 ALDH1A-related gene neighborhood (GN); black crosses: Hsa5 ALDH1A-related GN; pink crosses: Hsa9 ALDH1A-related GN; brown crosses: Hsa15 ALDH1A-related GN). (B) Two clusters of genes in Hsa15 and Hsa9 display a substantial number of conserved syntenies between the ALDH1A1 and ALDH1A2 GNs (red lines), but fewer conserved syntenies with the ALDH1A3 GN (green lines), supporting the idea that ALDH1A1 and ALDH1A2 are the closest sister paralogs, consistent with the phylogenetic tree ((ALDH1A1, ALDH1A2), ALDH1A3) in Figure 1. Golden lines show conserved synteny between Hsa15 and parts of Hsa9, probably due to a local transposition that moved the material between ALDH1A2 and ALDH1A3 from its original location on the right of ALDH1A1 to the left of ALDH1A1 (or vice versa).
Colored boxes correspond to regions shown in A. Figure S2 provides high-resolution images that include the names of the conserved syntenic genes. (C) Representation of a pair of paralogous gene clusters in Hsa15 and Hsa5 displaying extensive conserved synteny between the ALDH1A3 and ALDH1A3-ogm GNs (green lines). (D–F) Circle plots display the patterns of conserved synteny between the ALDH1A GNs (labeled with black arcs outside of each chromosome) revealed by the dotplot in A for Hsa1, Hsa5, Hsa9, Hsa15 and Hsa19 (see main text for explanations). While the patterns of conserved synteny between the ALDH1A1 and ALDH1A2 GNs (red lines) and between the ALDH1A3 and ALDH1A3-ogm GNs (green lines) are restricted to defined dense bundles (D), lines originating from Hsa1 (E) and Hsa19 (F) are not restricted to any particular ALDH1A GN (the different colors of the lines in E and F label various chromosomes). Circles represent chromosome centromeres, and dotted arcs label the approximate ALDH1A GN positions in the chromosomes.
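
As an illustration of how the points in a dotplot such as panel A could be gathered, the following minimal Python sketch collects the genes lying within a window around a focal gene and returns the positions of their paralogs. It is a sketch under stated assumptions, not the pipeline used for the figure: the `Gene` record, the `group` paralogy-group identifier, the toy gene names and coordinates, and the reading of the "10 Mb window" as 5 Mb on each side of the focal gene are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    pos: int     # representative position in bp (hypothetical)
    group: str   # paralogy-group identifier (hypothetical)


def neighbors(genes, focal_name, window_bp=10_000_000):
    """Genes on the focal gene's chromosome within window_bp / 2 of it.

    Assumption: the window is centered on the focal gene.
    """
    focal = next(g for g in genes if g.name == focal_name)
    half = window_bp // 2
    return [
        g for g in genes
        if g.chrom == focal.chrom
        and g.name != focal.name
        and abs(g.pos - focal.pos) <= half
    ]


def neighbor_paralogs(genes, focal_name, window_bp=10_000_000):
    """Return sorted (chrom, pos, neighbor_name) points, one per paralog
    of each gene found in the focal gene's neighborhood."""
    by_group = defaultdict(list)
    for g in genes:
        by_group[g.group].append(g)
    points = []
    for n in neighbors(genes, focal_name, window_bp):
        for p in by_group[n.group]:
            if p.name != n.name:
                points.append((p.chrom, p.pos, n.name))
    return sorted(points)


# Toy data; names and coordinates are invented for illustration only.
TOY_GENES = [
    Gene("FOCAL1", "chrA", 50_000_000, "focal"),
    Gene("n1", "chrA", 52_000_000, "g1"),   # inside the window
    Gene("n2", "chrA", 58_000_000, "g2"),   # outside the window
    Gene("p1", "chrB", 10_000_000, "g1"),   # paralog of n1
    Gene("p2", "chrB", 20_000_000, "g2"),   # paralog of n2
]

if __name__ == "__main__":
    for chrom, pos, source in neighbor_paralogs(TOY_GENES, "FOCAL1"):
        print(chrom, pos, source)
```

Each returned point would correspond to one cross in the dotplot, placed on the row of its chromosome at its position; regions where many such points accumulate are the candidates for the boxed gene neighborhoods.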