Fig 1.
Scenarios of retrovirus evolution in cetaceans.
(A) The land-to-water transition (LTW) scenario. A retrovirus infected and became integrated in the ancestor of cetaceans before or during the conquest of the aquatic environment and transitioned into water with its ancestral cetacean hosts. (B) The secondary host switching (SHS) scenario. Retroviruses infected and became endogenized in cetaceans through cross-species transmission from diverse sources after cetaceans became fully aquatic. (C) The distribution and expected phylogenetic pattern of cetacean ERVs under the two scenarios. Under the LTW scenario, the ERV should be identified in the genomes of nearly all cetaceans. Most of the cetacean ERVs are expected to be most closely related to H. amphibius ERVs, while the others are expected to be most closely related to ERVs of other artiodactyls (excluding H. amphibius). Under the SHS scenario, the ERV should be identified only in a sub-lineage of cetaceans and is not expected to be closely related to ERVs from any particular vertebrate species. The phylogenetic relationships of cetaceans are based on TimeTree [64] and the literature [65, 66]. Illustrations of mysticetes, odontocetes, and Pakicetus courtesy of Chris Huh, Chris Huh, and Conty, respectively.
Fig 2.
The copy numbers of distinct cetacean ERV lineages.
(A), (B), and (C) show the copy numbers of ERVs in the LTW lineages that belong to Class I ERVs (gammaretrovirus), Class I ERVs (epsilonretrovirus), and Class III ERVs, respectively. (D) and (E) show the copy numbers of ERVs in the SHS lineages that belong to Class I ERVs (gammaretrovirus) and Class II ERVs, respectively. The phylogenetic relationships of cetaceans are shown on the left. Odontocetes and mysticetes are highlighted in blue and red, respectively.
Fig 3.
Potential sources of cetacean ERV lineages.
(A) An overview of potential sources of all 315 cetacean ERV lineages. Boxes in blue and orange indicate the numbers of ERVs under the land-to-water transition (LTW) scenario and under the secondary host switching (SHS) scenario, respectively. (B to E) Potential sources of cetacean ERV lineages that belong to Class I (gammaretrovirus), Class I (epsilonretrovirus), Class III, and Class II ERVs, respectively. Boxes in blue and orange indicate the numbers of ERVs under the LTW scenario and under the SHS scenario, respectively. Illustrations of Artiodactyla, Tragulidae, and Monodelphis domestica courtesy of Zimices, Zimices, and Sarah Werning, respectively.
Fig 4.
ERV orthologous insertions in cetaceans.
(A) Examples of orthologous insertions for the SHS and the LTW ERV lineages. The phylogenetic relationships of cetaceans are shown on the left. The dotted boxes in red and blue show orthologous ERV insertions for the SHS and the LTW scenarios, respectively. The white and mauve rectangles represent 500 bp sequences flanking complete ERVs. The deep purple rectangles represent complete ERVs. Illustrations of cetacean species courtesy of Chris Huh. (B) ERV orthologous insertions between cetaceans and H. amphibius. Rectangles from left to right represent the 1,000 bp flanking sequence, 5’-LTR, internal genes of an ERV, 3’-LTR, and 1,000 bp flanking sequence, respectively. Dashed boxes indicate that the corresponding region is missing. (C) ERV orthologous insertions between mysticetes and odontocetes. Rectangles from left to right represent the 500 bp flanking sequence, ERV, and 500 bp flanking sequence, respectively. Dashed boxes indicate that the corresponding region is missing. (D) ERV and cetacean phylogeny congruence test. The event cost scheme (0, 1, 2, 1, 1) assigns costs to cospeciation, duplication, duplication with host switch, loss, and failure to diverge, respectively.
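As a rough illustration of how the event cost scheme in (D) scores a reconciliation, the following minimal sketch sums the cost of a set of events under the scheme (0, 1, 2, 1, 1). The function name and the event counts are hypothetical and are not taken from the analysis; the sketch only shows the arithmetic, not the search for an optimal reconciliation performed by cophylogeny software.

```python
# Event costs in the order given in the caption:
# cospeciation, duplication, duplication with host switch, loss, failure to diverge.
EVENT_COSTS = {
    "cospeciation": 0,
    "duplication": 1,
    "duplication_with_host_switch": 2,
    "loss": 1,
    "failure_to_diverge": 1,
}


def reconciliation_cost(event_counts):
    """Return the total cost of a reconciliation given its event counts.

    `event_counts` maps an event name to the number of times it occurs.
    This helper is hypothetical and exists only for illustration.
    """
    unknown = set(event_counts) - set(EVENT_COSTS)
    if unknown:
        raise ValueError(f"unknown event types: {sorted(unknown)}")
    return sum(EVENT_COSTS[event] * count for event, count in event_counts.items())


# Hypothetical counts, chosen only to show the calculation.
example_counts = {
    "cospeciation": 5,
    "duplication": 1,
    "duplication_with_host_switch": 2,
    "loss": 3,
}
total = reconciliation_cost(example_counts)
print(total)  # 5*0 + 1*1 + 2*2 + 3*1 = 8
```

Because cospeciation costs nothing under this scheme, reconciliations that explain the ERV tree mainly through cospeciation receive lower total costs than those requiring many host switches or losses.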
Fig 5.
Evolutionary dynamics of cetacean ERVs.
(A) Distribution of the genetic distance between the 5’- and 3’-LTRs of complete ERVs in the 315 cetacean ERV lineages. Blue and orange lines represent the distributions for the LTW and the SHS ERV lineages, respectively. The purple and green boxes represent the 95% highest posterior density (HPD) intervals of the genetic distance of introns between B. acutorostrata and H. amphibius (reflecting the divergence between cetaceans and hippopotamuses) and between B. acutorostrata and O. orca (reflecting the divergence between mysticetes and odontocetes), respectively, with dashed lines indicating the means. (B) Distribution of the genetic distance between the 5’- and 3’-LTRs of complete ERVs in different cetacean ERV lineages under the SHS scenario. The purple and green boxes represent the 95% HPD intervals of the genetic distance of introns between B. acutorostrata and H. amphibius and between B. acutorostrata and O. orca, respectively, with dashed lines indicating the means.
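The two LTRs of an ERV are identical at the time of integration, so the distance between them reflects divergence accumulated since insertion. The minimal sketch below computes an uncorrected p-distance between a pair of aligned LTRs. The caption does not state which distance measure or substitution model was used, so the p-distance, the function name, and the toy sequences are all assumptions made for illustration.

```python
def p_distance(ltr5, ltr3):
    """Uncorrected proportion of differing sites between two aligned LTRs.

    Columns containing a gap or an ambiguous base in either sequence are
    skipped. This is an illustrative assumption, not the measure used in
    the figure.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


# Toy aligned sequences (hypothetical): one difference across ten sites.
distance = p_distance("ACGTACGTAC", "ACGTACGAAC")
print(distance)  # 0.1
```

A distance computed in this way can be compared with the intron distance between two host species: an LTR distance smaller than the host distance suggests that the insertion postdates the host divergence.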
Fig 6.
Selection pressure on the genes of cetacean ERVs.
The dN/dS ratios of three retroviral genes (gag, pol, and env) of the SHS ERV lineages in a given species are shown. NA indicates not applicable, because there was insufficient information for the dN/dS calculation. × represents the loss of the corresponding gene.