Comparative analysis estimates the relative frequencies of co-divergence and cross-species transmission within viral families

doi:10.1371/journal.ppat.1006215

Fig 1.

Tanglegrams of phylogenetic trees created using simulated data.

Lines connect each virus with its respective host. Hence, if viruses and hosts have congruent phylogenies, indicative of strong virus-host co-divergence, there will be more horizontal than diagonal lines. Panel (A) illustrates a perfectly matched topology between virus and host trees, for which nPH85 = 0. Panel (B) exemplifies an entirely mismatched topology between virus and host trees, for which nPH85 = 1. Data from viruses in nature will fall between these two extremes. Panels (C) and (D) illustrate two examples where the host trees have one incongruent node. The incongruence in panel (C) lies in a shallower section of the tree than that in panel (D), but the two nPH85 values are the same, indicating that the position of the incongruence does not produce a systematic bias. Panel (E) shows the relationship between the nPH85 distance and the number of incongruent nodes between a pair of simulated trees with 100 tips.
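The behaviour of nPH85 described in panels (A) to (D) can be illustrated with a minimal sketch. It assumes that trees are written as nested Python tuples of tip names, that the PH85 distance is the number of non-trivial bipartitions found in only one of the two trees, and that normalization divides by 2(n - 3), the maximum for two fully resolved unrooted trees with n tips. The function names and the six-tip toy trees are hypothetical and are not taken from the study.

```python
def tips(tree):
    """Return the set of tip labels in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(tips(child) for child in tree))


def splits(tree):
    """Return the non-trivial bipartitions of the tree, treated as unrooted.

    Each bipartition is stored as the side that does NOT contain a fixed
    reference tip, so the same split always has the same representation.
    """
    all_tips = tips(tree)
    ref = min(all_tips)
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(walk(child) for child in node))
        side = clade if ref not in clade else all_tips - clade
        # Keep only splits with at least two tips on each side.
        if 1 < len(side) < len(all_tips) - 1:
            found.add(side)
        return clade

    walk(tree)
    return found


def nph85(tree_a, tree_b):
    """Normalized PH85 distance (assumed normalization: 2 * (n - 3))."""
    if tips(tree_a) != tips(tree_b):
        raise ValueError("trees must share the same tip labels")
    n = len(tips(tree_a))
    if n < 4:
        raise ValueError("need at least four tips")
    return len(splits(tree_a) ^ splits(tree_b)) / (2 * (n - 3))


# Hypothetical toy trees with six tips.
host = ((("a", "b"), "c"), ("d", ("e", "f")))
one_incongruent = ((("a", "c"), "b"), ("d", ("e", "f")))
mismatched = ((("a", "d"), "b"), ("e", ("c", "f")))

print(nph85(host, host))             # identical topologies
print(nph85(host, one_incongruent))  # one incongruent node
print(nph85(host, mismatched))       # no shared bipartitions
```

Because the distance only counts which bipartitions differ, moving a single incongruence deeper or shallower in the tree leaves the value unchanged, which is consistent with the comparison between panels (C) and (D).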

Fig 2.

Overall normalized topological distance between the two unrooted phylogenetic trees for each virus family, obtained by normalizing the Penny and Hendy [14] metric (i.e. nPH85).

A range of DNA (blue) and RNA (yellow) virus families are shown. nPH85 = 0 is indicative of virus-host co-divergence, while nPH85 = 1 suggests frequent cross-species transmission (red). For ease of interpretation, virus families are ranked by descending frequency of cross-species transmission.

Fig 3.

Relative node depths of incongruences between host and virus phylogenies showing the median and 25th and 75th percentiles (boxplots) as well as the raw data.

A relative node depth close to 0 can be interpreted as the occurrence of host-switching events at the tips of the phylogenetic tree, whereas a relative node depth close to 1 suggests host-switching events at the root of the phylogenetic tree. A range of DNA (blue) and RNA (yellow) virus families are shown. For ease of interpretation virus families are ranked as in Fig 2.

Fig 4.

Tanglegrams of rooted phylogenetic trees for each virus family.

Host trees were rooted first following their known phylogenetic history, with virus trees then rooted based on the host tree. The ‘untangle’ function was used to maximize the congruence between the host and virus phylogenies. Lines that connect the host (left) with its virus (right) are colored according to the host type (dark blue: mammals; light green: birds; light blue: reptiles and amphibians; red: fish; pink: invertebrates; dark green: plants). Phylogenies with the individual tip labels visible are shown in S1 Fig.

Fig 5.

(A) Reconciliation analysis of each virus family using Jane [15]. Boxplots illustrate the range of the proportion of possible events. The ‘event costs’ associated with incongruences between trees were conservative towards co-divergence and defined here as: 0 for co-divergence, 1 for duplication, 1 for host-jumping and 1 for extinction. Virus families are ranked from highest to lowest mean co-divergence. Abbreviations on the x-axis are as follows: ‘Co-div’ = co-divergence, ‘Dup’ = duplication, ‘HJ’ = host-jumping, ‘Ext’ = extinction. (B) Reconciliation of the Hepadnaviridae phylogeny with that of their vertebrate hosts, again using the co-phylogenetic method implemented in Jane [15]. The figure illustrates all possible co-divergence, extinction and host-jumping events (no lineage duplication events were reconstructed in this case).
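As a worked illustration of the cost scheme in panel (A), the sketch below scores a single reconciliation by summing event counts weighted by the stated costs, and converts the counts to proportions. It does not reproduce Jane's search for minimum-cost reconciliations; the function names and the event counts are hypothetical placeholders, not results from the study.

```python
# Event costs as stated in the caption: co-divergence is free,
# every other event costs 1.
EVENT_COSTS = {
    "co-divergence": 0,
    "duplication": 1,
    "host-jumping": 1,
    "extinction": 1,
}


def reconciliation_cost(counts, costs=EVENT_COSTS):
    """Total cost of one reconciliation: sum of count * cost per event type."""
    return sum(costs[event] * n for event, n in counts.items())


def event_proportions(counts):
    """Proportion of each event type among all reconstructed events."""
    total = sum(counts.values())
    return {event: n / total for event, n in counts.items()}


# Hypothetical event counts, for illustration only.
example = {
    "co-divergence": 6,
    "duplication": 0,
    "host-jumping": 3,
    "extinction": 1,
}

print(reconciliation_cost(example))
print(event_proportions(example))
```

With co-divergence costing nothing, a reconciliation that explains a node by co-divergence is always at least as cheap as one that invokes another event, which is the sense in which the scheme is conservative towards co-divergence.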

Fig 6.

(A) The nPH85 distance as a function of the number of viruses per virus family. Pearson’s correlation coefficient, R, was found to be statistically significant (p < 0.005). (B) nPH85 distances by genome type showing the median (horizontal line) and 25th and 75th percentiles. A t-test showed that the difference between these distances was significant (p < 0.05). As before, a range of DNA (blue) and RNA (yellow) virus families are shown.
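The two statistics named in this caption can be sketched with the Python standard library alone. The caption does not say which form of the t-test was used, so the unequal-variance (Welch) statistic below is an assumption, and p-values are left out because they require the t distribution. The function names are hypothetical, and the input lists are toy values chosen only to show the calls.

```python
import math
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def welch_t(a, b):
    """Welch's t statistic (assumed variant; the caption only says 't-test')."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Toy inputs, not data from the study.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(welch_t([1, 2, 3], [2, 3, 4]))
```

In panel (A) the two samples would be the number of viruses per family and the corresponding nPH85 distances; in panel (B) they would be the nPH85 distances of the DNA and RNA virus families.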

Table 1.

Summary of the virus data used in this study.

The best-fit amino acid substitution models were selected according to the Bayesian Information Criterion.
