Evolutionary Genomics Implies a Specific Function of Ant4 in Mammalian and Anole Lizard Male Germ Cells

Most vertebrates have three paralogous genes with identical intron-exon structures and a high degree of sequence identity that encode mitochondrial adenine nucleotide translocase (Ant) proteins, Ant1 (Slc25a4), Ant2 (Slc25a5) and Ant3 (Slc25a6). Recently, we and others identified a fourth mammalian Ant paralog, Ant4 (Slc25a31), with a distinct intron-exon structure and a lower degree of sequence identity. Ant4 was expressed selectively in testis and sperm in adult mammals and was indeed essential for mouse spermatogenesis, but it was absent in birds, fish and frogs. Since Ant2 is X-linked in mammalian genomes, we hypothesized that the autosomal Ant4 gene may compensate for the loss of Ant2 gene expression during male meiosis in mammals. Here we report that the Ant4 ortholog is conserved in the green anole lizard (Anolis carolinensis) and demonstrate that it is expressed in the anole testis. Further, a degenerate DNA fragment of a putative Ant4 gene was identified in syntenic regions of avian genomes, indicating that Ant4 was present in the common amniote ancestor. Phylogenetic analyses suggest an even more ancient origin of the Ant4 gene. Although anole lizards are presumed male (XY) heterogametic, like mammals, copy numbers of the Ant2 gene as well as its neighboring gene were similar between male and female anole genomes, indicating that the anole Ant2 gene is either autosomal or located in the pseudoautosomal region of the sex chromosomes, in contrast to the case in mammals. These results imply that the conservation of Ant4 is not likely driven simply by the sex chromosomal localization of the Ant2 gene and its subsequent inactivation during male meiosis. Taken together with the fact that the Ant4 protein has a uniquely conserved structure when compared to the somatic Ant1, 2 and 3 proteins, there may be a specific advantage for mammals and lizards to express Ant4 in their male germ cells.


Details of the phylogenetic and evolutionary analyses presented in "Evolutionary Genomics Implies a Specific Function of Ant4 in Mammalian and Anole Lizard Male Germ Cells" by Lim et al.
Four nexus files are presented as supplementary information for Lim et al. These files provide the sequence alignments used for phylogenetic and evolutionary analyses as well as the relevant phylogenetic trees. The files are:
• ANTalignment_annotated.nxs - alignment of 94 protein sequences in nexus format. Analyses of this alignment (and a subset of 65 ingroup sequences) were conducted using RAxML 7.2.8.
• ANT4_nuclaln_annotated.nxs - alignment of 7 nucleotide sequences, including 4 full-length Ant4 coding sequences and 3 partial Ant4 pseudogene sequences. This alignment includes the first 2 and last 3 nucleotides of the introns that are included in the pseudogene sequences.
• ANTrooted.tre - ML tree based upon 94 ANT protein sequences, including the yeast outgroup.
• ANTingroup.tre - ML tree based upon 65 ANT protein sequences, excluding the yeast outgroup and a number of divergent animal sequences.
These files are all in nexus format, allowing the commonly used phylogenetic analysis program PAUP* (http://paup.csit.fsu.edu) to be used for data management. Other programs could also be used for data management, but examples of useful commands are given only for PAUP*. Moreover, the files are written to echo instructions to the screen with proper formatting when executed in PAUP*. Opening (or executing) the files in other programs may change details of the formatting. To make the information easier to read, all critical information is also presented in this file.
The tree files can be visualized in any program that reads nexus format (e.g., FigTree; http://tree.bio.ed.ac.uk/software/figtree) or loaded into PAUP* and visualized using the describetrees command. To make the information easier to read, the same information is presented in this file as supplementary figures.

Information about the annotated ANT protein sequence alignment
The annotation in the nexus file is presented below, with some edits and additional annotations (presented as italics) to improve readability in this format.
Alignment of ANT (mitochondrial adenine nucleotide translocase) proteins used in "Evolutionary Genomics Implies a Specific Function of Ant4 in Mammalian and Anole Lizard Male Germ Cells" by Lim et al.
This alignment has been trimmed to remove the variable-length sequences at the amino- and carboxyl-termini. When the file is executed in PAUP* (http://paup.csit.fsu.edu), you can reconstruct the "ingroup" taxon set by executing the following commands before analyzing or exporting the data:
    restore ingroup /only;
    exclude ingroup_gaps;
(Many other nexus reader programs will allow similar manipulations of the data.)
Note that some taxon names are longer than allowed for a standard phylip format file, so exporting in phylip format is not recommended. To convert to a "relaxed phylip" format (i.e., a format suitable for analysis in programs like RAxML), simply edit this nexus file in a text editor or execute the following commands in a unix environment:
    echo "94 309" > ANTalignment.phy
    sed -n "/Matrix/,/;/p" ANTalignment_annotated.nxs | sed "1d" | sed "s/;//g" >> ANTalignment.phy
To produce the ingroup file, substitute 'echo "65 301"' for the first command.
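The unix commands above hard-code the alignment dimensions in the header line. As a minimal sketch only, the same conversion could be written in Python so that the header is counted from the matrix itself. The function name nexus_matrix_to_relaxed_phylip is hypothetical, and the sketch assumes a non-interleaved matrix with one taxon per line, taxon names without spaces, and nexus comments enclosed in square brackets; it has not been checked against the supplementary files themselves.

```python
import re


def nexus_matrix_to_relaxed_phylip(nexus_text):
    """Return relaxed-phylip text built from the Matrix block of a nexus file.

    Assumptions (not taken from the supplementary files): the matrix is
    non-interleaved, each row is "name sequence", and names contain no spaces.
    """
    # Nexus comments are enclosed in square brackets; drop them first.
    text = re.sub(r"\[.*?\]", "", nexus_text, flags=re.DOTALL)

    # Capture everything between the "Matrix" keyword and the closing semicolon.
    match = re.search(r"\bmatrix\b(.*?);", text, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("no Matrix block found")

    rows = []
    for line in match.group(1).splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        # First token is the taxon name; remaining tokens are sequence chunks.
        rows.append((parts[0], "".join(parts[1:])))
    if not rows:
        raise ValueError("Matrix block is empty")

    lengths = {len(seq) for _, seq in rows}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length; the matrix may be interleaved")

    # Header line: number of taxa and number of characters, counted from the data.
    header = f"{len(rows)} {lengths.pop()}"
    body = [f"{name} {seq}" for name, seq in rows]
    return "\n".join([header] + body) + "\n"
```

For the full alignment the computed header should agree with the "94 309" given above, and for an exported ingroup matrix with "65 301"; a mismatch would suggest that the file layout differs from the assumptions listed here.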

Sources of sequences:
Sequences from the Ensembl database are presented on the next page. For the chordate sequences from Ensembl, the protein accession codes can be generated from the taxon names by replacing the species name in the taxon label with the corresponding ENS code (e.g., the taxon name Anole_P00000011672 should be changed to ENSACAP00000011672). Ensembl codes for human and anole genes are:
Gene    Human                 Anole
ANT1    Human_P00000281456    Anole_P00000002268
ANT2    Human_P00000360671    Anole_P00000012600
ANT3    Human_P00000370808    Anole_P00000005895
ANT4    Human_P00000281154    Anole_P00000011672
As described above, these names can be converted to Ensembl protein codes by changing Human_ to ENS and Anole_ to ENSACA.
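The renaming rule can be illustrated with a short Python sketch. It covers only the two prefixes stated here (Human_ and Anole_); the names PREFIX_TO_ENSEMBL and taxon_to_ensembl are hypothetical, and prefixes for the other species in the alignment are deliberately not guessed.

```python
# Prefix replacements stated in the text; other species are not included.
PREFIX_TO_ENSEMBL = {
    "Human_": "ENS",
    "Anole_": "ENSACA",
}


def taxon_to_ensembl(name):
    """Convert an alignment taxon name to its Ensembl protein accession code."""
    for prefix, code in PREFIX_TO_ENSEMBL.items():
        if name.startswith(prefix):
            # Swap the species prefix for the Ensembl species code.
            return code + name[len(prefix):]
    raise KeyError(f"no Ensembl prefix known for taxon name: {name}")


if __name__ == "__main__":
    for taxon in ("Human_P00000281154", "Anole_P00000011672"):
        print(taxon, "->", taxon_to_ensembl(taxon))
```

Raising an error for unknown prefixes, rather than returning the name unchanged, keeps an unconverted taxon label from being mistaken for a valid accession code.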
Note that the full and aligned sequences are presented in the text file.
Nexus format tree files for the ML analyses of this dataset (using the LG+G+F model) are also available. They can be visualized using programs such as FigTree (available from http://tree.bio.ed.ac.uk/software/figtree).

Information about the rooted ML tree estimated using the protein sequence alignment
The following tree was estimated in RAxML (http://wwwkramer.in.tum.de/exelixis) using the LG+Γ+F model and rooted to the yeast sequences that were included (YMR056C, YBL030C, YBR085W). See above for details about the data matrix.