Abstract
The olfactory receptor (OR) multigene family is responsible for the sense of smell in vertebrate species. OR genes are scattered widely across our chromosomes and constitute one of the largest gene families in eutherian genomes. Previous studies revealed that eutherian OR genes diverged mainly during early mammalian evolution. However, the exact period when, and the ecological reason why, eutherian ORs diverged so strongly have remained unclear. In this study, I performed stringent data mining of marsupial opossum OR sequences and bootstrap analyses to estimate the periods of chromosomal migrations and gene duplications of OR genes during tetrapod evolution. The results indicate that chromosomal migrations occurred mainly during early vertebrate evolution before the monotreme-placental split, and that gene duplications occurred mainly during early mammalian evolution between the bird-mammal split and the marsupial-placental split, coinciding with the reduction of opsin genes in primitive mammals. These results suggest that the earlier chromosomal dispersal allowed the OR genes to expand easily afterwards, and that the nocturnal adaptation of early mammals might have triggered the OR gene expansion.
Citation: Kishida T (2008) Pattern of the Divergence of Olfactory Receptor Genes during Tetrapod Evolution. PLoS ONE 3(6): e2385. https://doi.org/10.1371/journal.pone.0002385
Editor: Jean-Nicolas Volff, Ecole Normale Supérieure de Lyon, France
Received: February 28, 2008; Accepted: April 22, 2008; Published: June 11, 2008
Copyright: © 2008 Takushi Kishida. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: The author has no support or funding to report.
Competing interests: The author has declared that no competing interests exist.
Introduction
Tetrapods can recognize various environmental odors using olfactory receptors (ORs). ORs belong to the superfamily of seven-transmembrane G-protein coupled receptors (GPCRs) and constitute one of the largest multigene families in vertebrate genomes [1]–[3]. The number of functional OR genes varies greatly among terrestrial tetrapod species, ranging from approximately 80 genes in chickens to 1200–1600 genes in rats [4]–[6]. The number of functional OR genes has been reported to parallel a species' reliance on the sense of smell. The fraction of OR pseudogenes seems to have increased in the primate lineage leading to humans [7], suggesting a reduced dependence on olfaction as a result of the acquisition of full trichromatic color vision [8]. Large-scale degeneration of OR genes is found in cetaceans, which have secondarily adapted to a marine habitat and have lost or greatly reduced the sense of smell acquired in terrestrial environments [9]. These findings suggest that the number of intact OR genes reflects the odor recognition ability of tetrapod species.
OR genes are scattered widely in placental mammalian chromosomes. For example, OR genes can be found on every human chromosome except for chromosomes 20 and Y [10]. Using the limited data available in 2001 [11], Glusman et al. estimated that OR genes had migrated from chromosome 11 to other chromosomal regions mainly before 310 MYA, before the mammal-bird split. They also indicated that OR genes were evolutionarily relatively stable between the mammal-bird split and placental-marsupial split during vertebrate evolution.
Recently, an SWS2 class opsin gene, which encodes one of the four spectrally distinct classes of vertebrate cone pigment and has never been found in marsupial or placental mammals, was found in a monotreme platypus, suggesting that placental mammals lost their sense of color vision gradually during early mammalian evolution between the mammal-bird split and the placental-marsupial split [12]. As mentioned above, primates have compensated for their reduced sense of smell by acquisition of trichromatic color vision, and it could also be hypothesized that primitive mammals compensated for their reduced sense of color vision by enlargement of the size of their OR repertoires. This hypothesis suggests that the size of our OR repertoires expanded mainly in the period between the mammal-bird split and the placental-marsupial split, and that the placental-marsupial last common ancestor (LCA) had acquired a large number of ORs. However, Glusman et al. estimated that OR genes expanded mainly after the placental-marsupial split [11].
In this study, I stringently mined the marsupial opossum genome for OR genes and estimated how chromosomal migration and the size of the OR repertoire changed along the tetrapod lineage leading to modern Euarchontoglires. The analysis used 6 genome-sequenced tetrapod species, including opossum, and followed the method designed by Suga et al. [13]–[14] for estimating the periods of gene migrations and duplications.
Results and Discussion
Table 1 shows the number of opossum OR genes identified in this study (available as supporting Text S1). The total number of OR genes generally agreed with other independent reports [5]–[6]. The pseudogene fraction might have been underestimated because a number of pseudogenes would be included in the partial intact genes. In addition to the opossum OR gene database, previously reported OR gene databases for 5 tetrapods (Table 2) were used in this study. Partial sequences and pseudogenes were excluded from further analyses because their inclusion would have sharply reduced the alignment regions. The chromosomal distribution of mouse OR genes, obtained from the Trask Laboratory mouse OR gene database (http://www.fhere.org/science/labs/trask/OR/), is shown in Table S1.
Bootstrap analyses were performed by the standard procedure with 100 resamplings, modified from the method designed by Suga et al. [13]–[14] (Text S2), in order to calculate the number of chromosomal migrations (Fig. 1) and the number of gene duplications (Fig. 2). Fig. 1 indicates that chromosomal migrations occurred mainly during early vertebrate evolution before the monotreme-placental split. In contrast, regarding gene duplications, Fig. 2 indicates that OR genes were duplicated mainly in the period between the mammal-bird split and the placental-marsupial split. These results suggest that chromosomal dispersal occurred ahead of gene expansion.
The distribution of the number of chromosomal migrations was calculated by repeating the bootstrap resampling procedure [28] 100 times and by constructing the phylogenetic tree for each resampling procedure based on the neighbor-joining method [29]. Mean±standard deviation is shown in the figure. In mammals, only class II ORs were taken into account because all class I ORs are located in the same chromosomal region [11].
Mean±standard deviation is shown in the figure. The detailed data for mammals are shown in Table S2.
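The resampling step behind Figs. 1 and 2 can be sketched as follows. This is a minimal illustration, not the procedure of Text S2: `count_events` is a hypothetical callable standing in for tree construction and the counting of migrations or duplications, and the function names and the fixed seed are assumptions made for the example.

```python
import random
import statistics

def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement (bootstrap of sites)."""
    n_cols = len(alignment[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[i] for i in picks) for seq in alignment]

def bootstrap_mean_sd(alignment, count_events, n_resamplings=100, seed=0):
    """Apply count_events to each resampled alignment; return mean and SD.

    count_events is a hypothetical stand-in for 'build a tree from this
    pseudo-replicate and count the events of interest'.
    """
    rng = random.Random(seed)
    counts = [count_events(bootstrap_columns(alignment, rng))
              for _ in range(n_resamplings)]
    return statistics.mean(counts), statistics.stdev(counts)
```

Each pseudo-replicate has the same number of columns as the input alignment, so any tree-building routine that accepts the original alignment can be passed in unchanged.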
Vertebrates are known to have developed well-established tetrachromatic color vision before the fish-tetrapod split [15]. Vertebrate tetrachromatic color vision relies on four spectrally distinct classes of cone pigment encoded by distinct opsin genes: the SWS1, SWS2, Rh2 and LWS classes [16]. It has been reported that placental mammals lost the SWS2 and Rh2 classes after the bird-mammal split and now retain only the LWS and SWS1 classes, and this loss is thought to have resulted from the nocturnal lifestyle of primitive mammals [16]. Some Australian marsupials are suggested to have evolved trichromatic color vision [17]. So far, however, despite substantial efforts, no SWS2 or Rh2 opsin genes have been identified in any marsupial genome [16], which strongly suggests that mammals had already been reduced to dichromatic color vision before the placental-marsupial split. Recently, an SWS2 class opsin gene was found in the platypus genome [12], indicating that the SWS2 class opsin gene was lost in the placental mammalian lineage after the placental-monotreme split. On the other hand, no Rh2 class opsin gene was found in the platypus genome [12], which suggests that the Rh2 class might have been lost before the placental-monotreme split. Taken together, these findings suggest that mammals lost their color vision gradually between the mammal-bird split and the placental-marsupial split because of nocturnal adaptation. In this study, Fig. 2 indicates that a large-scale duplication of OR genes occurred in the placental mammalian lineage between the mammal-bird split and the placental-marsupial split, so the expansion of OR genes appears to have coincided with the reduction of opsin genes. A nocturnal lifestyle would have required a well-developed sense of smell regardless of the sense of color vision.
Metaphorically, the chromosomal scattering of OR genes would have been the fuse of an explosive, and nocturnal adaptation might have been the spark that triggered the OR gene expansion.
However, phylogenetic analysis suggested that one subgroup of OR genes called family 7 [18], which comprises the largest subgroup in the human OR gene repertoire [11], diverged after the placental-marsupial split (Fig. 3). Interestingly, the family 7 subgroup contains some receptors which are thought to have become necessary very recently during mammalian evolution, such as the human OR7D4 receptor, which is activated only by androstenone or androstadienone pheromones [19]. Further studies of the family 7 subgroup could be expected to reveal some interesting aspects of modern eutherian evolution.
The tree was constructed by the neighbor-joining method [29], based on Poisson correction distance [30] matrices. OTUs in red are opossum ORs, those in blue are dog ORs, and those in green are mouse ORs. Five human ORs belonging to other subgroups were used as outgroups. Bootstrap values were obtained from 1000 resamplings, and values >60% are shown.
Finally, the estimated numbers of OR genes possessed by our ancestors are shown in Table 3. The estimation method is detailed in the ‘Materials and Methods’ section. The estimated gene numbers indicate that OR genes diverged gradually, with the major divergence occurring during early mammalian evolution, between the mammal-bird split and the placental-marsupial split. The estimated numbers in the monotreme-placental LCA and the marsupial-placental LCA (which are probably underestimates, as explained in ‘Materials and Methods’) are much larger than those in a previous report [6], perhaps for the following reasons: (i) the previous method did not consider the number of genes that were lost in both lineages after speciation ( = bc/a, according to Eq. 9'), and (ii) the previous method adopted the condensed tree method [30] for evaluating reliability, which necessarily underestimates the number of LCA genes because ambiguous subtrees are not considered in condensed trees. The other OR databases were also analyzed, and the results essentially support the main conclusion of this study (Table S4).
Materials and Methods
1. Collecting opossum OR repertoire
The opossum OR gene repertoire was constructed from the opossum draft genome sequence database downloaded from the Ensembl trace server (ftp.ensembl.org/pub/traces/monodelphis_domestica) on 20 December 2004 (ver. e!27). For each sequence, regions with low quality scores (quality value < 10, according to the quality files) were removed to obtain reliable data. The TFASTY program [20] was run against these genome sequences to identify OR coding regions, using known human, mouse and zebrafish OR gene sequences as queries. As a result, 8410 OR-related sequences were obtained.
To merge sequences that derive from the same OR gene, two sources of intralocus variation must be taken into account: interallelic variation and sequencing error. I tried four conservative stringency values: 98.0%, 98.5%, 99.0% and 99.5%. For every value except 99.5%, there was at least one group of three sequences that violated transitivity, i.e. seq. A = seq. B and seq. B = seq. C, but seq. A ≠ seq. C. Therefore, I opted for a stringency value of 99.5% with >100 bp overlap to minimize erroneous clone merging. Finally, the sequences were aligned with known OR genes to identify the amino acid coding regions. All sequences were searched against the entire GenBank database using the BLAST program [21] to ensure that their best three hits were known ORs.
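The transitivity check described above can be sketched as follows. Several points are assumptions made for illustration: the stringency value is treated as percent identity applied with "greater than or equal to", pairs without a sufficient overlap are treated as not comparable rather than as different, and the input format (a table of pairwise identity and overlap) and all function names are hypothetical.

```python
from itertools import combinations

def link_state(hit, min_identity, min_overlap=100):
    """True = same gene, False = different, None = not comparable.

    hit is (percent identity, overlap in bp) or None when the pair
    was never aligned. Only overlaps > min_overlap are considered.
    """
    if hit is None or hit[1] <= min_overlap:
        return None
    return hit[0] >= min_identity

def transitivity_violations(names, pairs, min_identity):
    """List triples where two pairs merge but the third pair does not.

    pairs maps frozenset({x, y}) -> (percent identity, overlap in bp).
    """
    bad = []
    for a, b, c in combinations(names, 3):
        states = [link_state(pairs.get(frozenset(p)), min_identity)
                  for p in ((a, b), (b, c), (a, c))]
        if states.count(True) == 2 and states.count(False) == 1:
            bad.append((a, b, c))
    return bad
```

Running the check once per candidate stringency value and keeping the values that return an empty list mirrors the selection logic in the text.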
2. Phylogenetic analyses
It has been reported that large mammalian OR genes can clearly be classified into two subfamilies (class I and class II) based on the sequence similarity, while non-mammalian OR genes cannot be as easily classified as mammalian ORs because of their wide diversity [3], [22]. In this study, mammalian OR sequences were divided into two subfamilies and each subfamily was analyzed independently to obtain more accurate results. Dog ORs were classified according to the classification in the HORDE database ([23], http://bioportal.weizmann.ac.il/HORDE/). Platypus, opossum and mouse OR sequences were searched against the HORDE#42 human OR database using the FASTA3 program [24] and classified into class I or II subfamilies according to their most similar human sequences.
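The assignment of a query sequence to the class of its most similar human OR can be sketched as below. The input format (tuples of query, human subject and similarity score, with higher scores meaning greater similarity) and the function name are assumptions for illustration and do not reflect the actual FASTA3 output format.

```python
def classify_by_best_hit(hits, human_class):
    """Assign each query the class ('I' or 'II') of its best human hit.

    hits: iterable of (query_id, human_id, score); higher score = more similar.
    human_class: dict mapping human OR id -> 'I' or 'II'.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: human_class[subject] for query, (subject, _) in best.items()}
```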
Deduced amino acid sequences of OR genes in the compared species were aligned using the MAFFT program [25] with manual adjustments. Positions with alignment gaps were excluded from further analyses. The root of the tree of vertebrate OR genes is difficult to determine because even the closest non-OR GPCR gene is too divergent to provide accurate root information. In this study, an amphioxus GPCR gene (amphi-GPCR1, GenBank accession no. AB182635) was used as an outgroup, as suggested by Satoh [26]. The trees of mammalian class I OR genes were rooted by a class II human OR gene (OR2T4, GenBank accession no. NM_001004696), and class II trees by a class I human OR gene (OR51M1, GenBank accession no. NM_001004756). The aligned sequence data analyzed in this study are available as supporting Text S3–S10.
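Two small steps of this pipeline lend themselves to a sketch: the exclusion of gapped alignment positions, and the Poisson correction distance d = -ln(1 - p) that was used for the distance matrices of the Fig. 3 tree, where p is the proportion of differing sites. The function names are hypothetical, `-` is assumed as the gap character, and the actual software used for these steps is not specified in the text.

```python
import math

def drop_gap_columns(alignment, gap="-"):
    """Keep only columns in which no sequence has a gap (complete deletion)."""
    keep = [i for i in range(len(alignment[0]))
            if all(seq[i] != gap for seq in alignment)]
    return ["".join(seq[i] for i in keep) for seq in alignment]

def poisson_distance(seq1, seq2):
    """Poisson correction distance between two gap-free aligned sequences."""
    diffs = sum(x != y for x, y in zip(seq1, seq2))
    p = diffs / len(seq1)
    # Undefined when p == 1 (log of zero); such pairs are too divergent.
    return -math.log(1.0 - p)
```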
3. Estimation of the number of ancestral OR genes
Every multigene phylogenetic tree consisting of two species (sp.1 and sp.2) can be resolved into three types of phylogenetic subtrees, if genes derived from intraspecific duplications are considered to be one gene (Fig. 4(a)). For example, the imaginary tree shown in Fig. 4(b) can be resolved into 2 type-A subtrees, 1 type-B subtree and 1 type-C subtree. Here, the number of subtrees is denoted by a for type-A, b for type-B and c for type-C. The set of sp.1-sp.2 LCA genes is denoted by G0. Subsets of G0 passed on to sp.1 or sp.2 are denoted by G1 and G2. Then, the following equations hold (|G| is the number of elements of the set G, and G1c is the complement of G1):(1)(2)(3)(4)(5)
(a) The three types of phylogenetic subtrees. Genes derived from intraspecific duplications are considered to be one gene. Type-A indicates that a gene is found in both sp.1 and sp.2 lineages. Type-B indicates that orthologous genes of a gene of sp.1 are not found in sp.2. Type-C indicates that orthologous genes of a gene of sp.2 are not found in sp.1. (b) An imaginary tree of a multigene family in two species (sp.1 and sp.2). It can be resolved into two type-A subtrees, one type-B subtree and one type-C subtree.
On the assumption that genes in the lineages leading to sp.1 or sp.2 evolve independently, namely, subset G1 and G2 are independent from each other, the following equation is obtained:(6)
If x is defined as the number of LCA genes ( = |G0|), the following equation is derived from Eq. 1 and Eq. 2:(7)
Then Eq. 6 can be expressed in terms of a, b, c and x using Eq. 2, Eq. 4, Eq. 5 and Eq. 7:(8)
Finally, the following equation is obtained by solving Eq. 8:
$$x = \frac{(a+b)(a+c)}{a} \tag{9}$$
Eq. 9 means that the number of LCA genes can be estimated by counting the number of type-A, B and C subtrees. Eq. 9 can be expanded as follows:
$$x = a + b + c + \frac{bc}{a} \tag{9'}$$
The value a+b+c is the number of G0 genes that remain in G1 and/or G2, and, according to Eq. 9', the value bc/a is the estimated number of G0 genes that were lost in both the G1 and G2 lineages.
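The path from the independence assumption to Eq. 9 can be sketched as follows. This is a reconstruction from the definitions of a, b, c, G0, G1 and G2 given above. The intermediate lines are assumptions about the form of the argument and need not match the numbered equations one by one; only the final line is fixed by the text, since it must agree with Eq. 9 and Eq. 9'.

```latex
\begin{align*}
|G_1 \cap G_2| &= a, \qquad |G_1 \cap G_2^{c}| = b, \qquad |G_1^{c} \cap G_2| = c,\\
|G_1| &= a + b, \qquad |G_2| = a + c, \qquad |G_0| = x,\\
\frac{|G_1 \cap G_2|}{|G_0|} &= \frac{|G_1|}{|G_0|}\cdot\frac{|G_2|}{|G_0|}
  \quad\text{(independence of $G_1$ and $G_2$)},\\
\frac{a}{x} &= \frac{(a+b)(a+c)}{x^{2}},\\
x &= \frac{(a+b)(a+c)}{a} = a + b + c + \frac{bc}{a}.
\end{align*}
```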
Some sources of potential errors, however, should be noted in the estimation of the value of x in Eq. 9. Alternative gene loss in sp.1 and sp.2 between two adjacent subtrees, concerted evolution [27] or some positive correlations between G1 and G2 might lead to underestimation of the number of LCA genes.
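The estimator of Eq. 9 and the decomposition of Eq. 9' can be sketched in a few lines. The function names are hypothetical, and exact fractions are used only to keep the arithmetic transparent. The worked example uses the imaginary tree of Fig. 4(b), for which a = 2, b = 1 and c = 1.

```python
from fractions import Fraction

def estimate_lca_genes(a, b, c):
    """Estimated number of LCA genes, x = (a + b)(a + c) / a (Eq. 9)."""
    if a <= 0:
        raise ValueError("at least one type-A subtree is required")
    return Fraction((a + b) * (a + c), a)

def lost_in_both(a, b, c):
    """Estimated number of LCA genes lost in both lineages, bc/a (Eq. 9')."""
    if a <= 0:
        raise ValueError("at least one type-A subtree is required")
    return Fraction(b * c, a)

# Imaginary tree of Fig. 4(b): two type-A, one type-B, one type-C subtree.
a, b, c = 2, 1, 1
x = estimate_lca_genes(a, b, c)     # 9/2
retained = a + b + c                # 4
lost = lost_in_both(a, b, c)        # 1/2
```

For this tree, four ancestral genes are still represented in at least one species and an estimated half a gene was lost in both, giving x = 4.5. The estimate is undefined when a = 0, because independence cannot be assessed without any gene shared between the two species.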
Supporting Information
Text S1.
Opossum OR database obtained in this study
https://doi.org/10.1371/journal.pone.0002385.s005
(1.45 MB TXT)
Text S2.
Supporting materials and methods
https://doi.org/10.1371/journal.pone.0002385.s006
(0.02 MB TXT)
Text S3.
The OR sequences aligned between frog and mouse
https://doi.org/10.1371/journal.pone.0002385.s007
(1.13 MB TXT)
Text S4.
The OR sequences aligned between chicken and mouse
https://doi.org/10.1371/journal.pone.0002385.s008
(0.94 MB TXT)
Text S5.
The class I OR sequences aligned between platypus and mouse
https://doi.org/10.1371/journal.pone.0002385.s009
(0.07 MB TXT)
Text S6.
The class II OR sequences aligned between platypus and mouse
https://doi.org/10.1371/journal.pone.0002385.s010
(0.84 MB TXT)
Text S7.
The class I OR sequences aligned between opossum and mouse
https://doi.org/10.1371/journal.pone.0002385.s011
(0.14 MB TXT)
Text S8.
The class II OR sequences aligned between opossum and mouse
https://doi.org/10.1371/journal.pone.0002385.s012
(0.99 MB TXT)
Text S9.
The class I OR sequences aligned between dog and mouse
https://doi.org/10.1371/journal.pone.0002385.s013
(0.14 MB TXT)
Text S10.
The class II OR sequences aligned between dog and mouse
https://doi.org/10.1371/journal.pone.0002385.s014
(1.24 MB TXT)
Acknowledgments
I thank Dr. Elizabeth Nakajima for checking the English text; Dr. Daisuke Hoshiyama, Dr. Yoshihisa Shirayama, Dr. Shin Kubota and all members of the Seto Marine Biological Laboratory, Kyoto University for helpful comments.
Author Contributions
Conceived and designed the experiments: TK. Performed the experiments: TK. Analyzed the data: TK. Contributed reagents/materials/analysis tools: TK. Wrote the paper: TK.
References
- 1. Buck L, Axel R (1991) A novel multigene family may encode odorant receptors: a molecular basis for odor recognition. Cell 65: 175–187.
- 2. Firestein S (2001) How the olfactory system makes sense of scents. Nature 413: 211–218.
- 3. Niimura Y, Nei M (2006) Evolutionary dynamics of olfactory and other chemosensory receptor genes in vertebrates. J Hum Genet 51: 505–517.
- 4. Ache BW, Young JM (2005) Olfaction: diverse species, conserved principles. Neuron 48: 417–430.
- 5. Aloni R, Olender T, Lancet D (2006) Ancient genomic architecture for mammalian olfactory receptor clusters. Genome Biol 7: R88.
- 6. Niimura Y, Nei M (2007) Extensive gains and losses of olfactory receptor genes in mammalian evolution. PLoS ONE 2: e708.
- 7. Rouquier S, Blancher A, Giorgi D (2000) The olfactory receptor gene repertoire in primates and mouse: evidence for reduction of the functional fraction in primates. Proc Natl Acad Sci U S A 97: 2870–2874.
- 8. Gilad Y, Wiebe V, Przeworski M, Lancet D, Pääbo S (2004) Loss of olfactory receptor genes coincides with the acquisition of full trichromatic vision in primates. PLoS Biol 2: 120–125.
- 9. Kishida T, Kubota S, Shirayama Y, Fukami H (2007) The olfactory receptor gene repertoires in secondary-adapted marine vertebrates: evidence for reduction of the functional proportions in cetaceans. Biol Lett 3: 428–430.
- 10. Rouquier S, Taviaux S, Trask BJ, Brand-Arpon V, van den Engh G, et al. (1998) Distribution of olfactory receptor genes in the human genome. Nat Genet 18: 243–250.
- 11. Glusman G, Yanai I, Rubin I, Lancet D (2001) The complete human olfactory subgenome. Genome Res 11: 685–702.
- 12. Davies WL, Carvalho LS, Cowing JA, Beazley LD, Hunt DM, et al. (2007) Visual pigments of the platypus: a novel route to mammalian colour vision. Curr Biol 17: R161–R163.
- 13. Suga H, Kuma K, Iwabe N, Nikoh N, Ono K, et al. (1997) Intermittent divergence of the protein tyrosine kinase family during animal evolution. FEBS Lett 412: 540–546.
- 14. Suga H, Ono K, Miyata T (1999) Multiple TGF-β receptor related genes in sponge and ancient gene duplications before the parazoan-eumetazoan split. FEBS Lett 453: 346–350.
- 15. Collin SP, Trezise AEO (2004) The origins of colour vision in vertebrates. Clin Exp Optom 87: 217–223.
- 16. Bowmaker JK, Hunt DM (2006) Evolution of vertebrate visual pigments. Curr Biol 16: R484–R489.
- 17. Arrese CA, Hart NS, Thomas N, Beazley LD, Shand J (2002) Trichromacy in Australian marsupials. Curr Biol 12: 657–660.
- 18. Glusman G, Bahar A, Sharon D, Pilpel Y, White J, et al. (2000) The olfactory receptor gene superfamily: data mining, classification, and nomenclature. Mammalian Genome 11: 1016–1023.
- 19. Keller A, Zhuang H, Chi Q, Vosshall LB, Matsunami H (2007) Genetic variation in a human odorant receptor alters odour perception. Nature 449: 468–472.
- 20. Pearson WR, Wood T, Zhang Z, Miller W (1997) Comparison of DNA sequences with protein sequences. Genomics 46: 24–36.
- 21. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215: 403–410.
- 22. Niimura Y, Nei M (2005) Evolutionary dynamics of olfactory receptor genes in fishes and tetrapods. Proc Natl Acad Sci U S A 102: 6039–6044.
- 23. Olender T, Feldmesser E, Atarot T, Eisenstein M, Lancet D (2004) The olfactory receptor universe--from whole genome analysis to structure and evolution. Genet Mol Res 3: 545–553.
- 24. Pearson WR, Lipman DJ (1988) Improved tools for biological sequence comparison. Proc Natl Acad Sci U S A 85: 2444–2448.
- 25. Katoh K, Misawa K, Kuma K, Miyata T (2002) MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res 30: 3059–3066.
- 26. Satoh G (2005) Characterization of novel GPCR gene coding locus in amphioxus genome: gene structure, expression, and phylogenetic analysis with implication for its involvement in chemoreception. Genesis 41: 47–57.
- 27. Nei M, Rooney AP (2005) Concerted and birth-and-death evolution of multigene families. Annu Rev Genet 39: 121–152.
- 28. Felsenstein J (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39: 783–791.
- 29. Saitou N, Nei M (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4: 406–425.
- 30. Nei M, Kumar S (2000) Molecular evolution and phylogenetics. New York: Oxford University Press.
- 31. Grus WE, Shi P, Zhang J (2007) Largest vertebrate vomeronasal type 1 receptor gene repertoire in the semiaquatic platypus. Mol Biol Evol 24: 2153–2157.
- 32. Young JM, Friedman C, Williams EM, Ross JA, Tonnes-Priddy L, et al. (2002) Different evolutionary processes shaped the mouse and human olfactory receptor gene families. Hum Mol Genet 11: 535–546.