Figure 1.

Lineage sorting within the branches of a species tree.

Even though C and D diverged from their most recent common ancestor at time T1, going back in time one observes that their gene lineages (solid lines) persisted further into the past and coalesced at time t′, which preceded the speciation time T2. In this scenario, the gene lineages from B and D happened to coalesce at time T2, after t′, thus resulting in the gene tree (A, (C, (B, D))), which disagrees with the species tree (A, (B, (C, D))).
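
The disagreement between the two trees in this caption can be made concrete by comparing their clusters (the leaf sets below each internal node). The following is a minimal sketch, assuming trees are written as nested Python tuples of leaf names; the function name `clusters` is introduced here for illustration only and is not taken from the article.

```python
def clusters(tree):
    """Return the clusters (leaf sets of internal nodes) of a rooted tree.

    Assumption: a tree is a nested tuple of leaf-name strings,
    e.g. ("A", ("B", ("C", "D"))).
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):          # a leaf contributes only its own name
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)                   # every internal node defines one cluster
        return below

    leaves(tree)
    return found


species_tree = ("A", ("B", ("C", "D")))
gene_tree = ("A", ("C", ("B", "D")))

# Clusters present in one tree but not in the other.
only_in_species = clusters(species_tree) - clusters(gene_tree)
only_in_gene = clusters(gene_tree) - clusters(species_tree)
```

Under this representation the species tree contains the cluster {C, D} while the gene tree contains {B, D} instead, which is exactly the incongruence the figure depicts.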

Figure 2.

Approaches for inferring species trees.

In the combined analysis approach (top), the sequences of the four loci are concatenated, generating one sequence data set, which is then analyzed by any of a host of phylogenetic tree reconstruction methods. In the separate analysis approach (bottom), a gene tree is reconstructed for each locus, and a species tree that reconciles their incongruence is inferred.
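
As a rough illustration of the combined analysis approach, the sketch below concatenates per-locus alignments into a single data set. It assumes each locus is a dictionary from taxon name to aligned sequence; the function name `concatenate`, the toy sequences, and the choice to pad a missing taxon with gap characters are all assumptions made for this example, not details from the article.

```python
def concatenate(loci):
    """Concatenate per-locus alignments into one sequence per taxon.

    Assumption: `loci` is a list of dicts mapping taxon name -> aligned
    sequence, with all sequences in a locus having the same length.
    """
    taxa = sorted(set().union(*loci))
    combined = {taxon: "" for taxon in taxa}
    for locus in loci:
        length = len(next(iter(locus.values())))
        for taxon in taxa:
            # A taxon missing from a locus is padded with gaps (an assumption).
            combined[taxon] += locus.get(taxon, "-" * length)
    return combined


# Two made-up loci; the second lacks taxon "C".
loci = [
    {"A": "ACG", "B": "ACT", "C": "AGT"},
    {"A": "GG", "B": "GA"},
]
combined = concatenate(loci)
```

The resulting single alignment would then be passed to a tree reconstruction method, whereas the separate analysis approach would build one tree per entry of `loci`.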

Figure 3.

The species tree for the Apicomplexan data as inferred using the majority consensus method and reported in [4].

The species Tt (Tetrahymena thermophila) is the outgroup. The numbers on the tree branches are bootstrap support values based on maximum likelihood, maximum parsimony and neighbor joining methods, respectively.

Figure 4.

The species tree for the yeast data set as inferred using the concatenation method and reported in [1].

All branches in the tree have 100% bootstrap support values.

Figure 5.

Optimal and sub-optimal trees inferred under the MDC criterion for the Apicomplexan data set.

(A) The optimal (species) tree inferred by our method for the Apicomplexan data set; this tree requires 440 deep coalescences to reconcile all 268 gene trees. (B, C) The two sub-optimal species trees, with 469 and 542 deep coalescences, respectively. The value on each branch is the number of extra lineages within that branch when reconciling all 268 gene trees.

Figure 6.

Plot of the number of extra lineages for each of the 247 binary (fully resolved) species tree candidates identified as maximal cliques in the compatibility graph of the gene tree clusters.

The three lowest values are 440, 469 and 542. The trees corresponding to these values are shown in Figure 5A, 5B and 5C, respectively.

Figure 7.

The only two gene trees of the Apicomplexan data set that do not have the cluster (Pv, Pf).

(A) The coalescence process, as inferred by MDC, for gene tree (((Ta, Bb), ((((Tg, Et), Cp), Pv), Pf)), Tt). (B) The coalescence process, as inferred by MDC, for gene tree ((((((Ta, Bb), (Tg, Et)), Cp), Pv), Pf), Tt).

Figure 8.

Reconciliations of the two gene trees in Figure 7 and the species tree in Figure 5A assuming HGT as the source of incongruence.

(A) The reconciliation scenario for the gene tree (((Ta, Bb), ((((Tg, Et), Cp), Pv), Pf)), Tt). (B) The reconciliation scenario for the gene tree ((((((Ta, Bb), (Tg, Et)), Cp), Pv), Pf), Tt).

Figure 9.

The species tree inferred by our method for the yeast data set.

(A) The tree topology and the number of extra lineages, under the optimal reconciliation, for each of its branches. (B) Plot of the number of extra lineages for all 48 species tree candidates.

Figure 10.

The six best sub-optimal trees for the yeast data set.

These trees, from left to right and top to bottom, have totals of 134, 163, 170, 186, 191 and 193 extra lineages, respectively. The values on the branches are the numbers of extra lineages within them.

Figure 11.

Analysis of the synthetic data sets.

(A) The average percentage of clusters induced by species trees that are not found in the set of clusters induced by gene trees, based on the simulated data. The x-axis indicates the number of sampled gene trees. (B) The performance of our method on the simulated data. The x-axis indicates the number of sampled gene trees; the y-axis is the average Robinson-Foulds distance between the species tree and the tree inferred by our method. (C) The difference between the number of extra lineages of the true species tree and that of the inferred optimal tree.
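
The two quantities plotted in panels A and B can be sketched on toy trees as follows. This is an illustrative sketch, not the authors' code: trees are assumed to be nested tuples of leaf names, the Robinson-Foulds distance is taken here as the unnormalized size of the symmetric difference of the two rooted cluster sets, and the full taxon set is excluded as a trivial cluster when computing the percentage. All function names are hypothetical.

```python
def clusters(tree):
    """Leaf sets of all internal nodes of a nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    leaves(tree)
    return found


def missing_cluster_percentage(species_tree, gene_trees):
    """Percentage of non-trivial species-tree clusters absent from all gene trees."""
    species = clusters(species_tree)
    root = max(species, key=len)           # the full taxon set is treated as trivial
    nontrivial = species - {root}
    seen = set().union(*(clusters(g) for g in gene_trees))
    return 100.0 * len(nontrivial - seen) / len(nontrivial)


def rf_distance(tree_a, tree_b):
    """Rooted, unnormalized Robinson-Foulds distance (assumed variant)."""
    return len(clusters(tree_a) ^ clusters(tree_b))


species_tree = ("A", ("B", ("C", "D")))
gene_trees = [("A", ("C", ("B", "D"))), ("A", (("B", "C"), "D"))]

pct_missing = missing_cluster_percentage(species_tree, gene_trees)
distance = rf_distance(species_tree, gene_trees[0])
```

In this toy case the species-tree cluster {C, D} appears in neither gene tree, so a method restricted to gene-tree clusters could not recover the species tree exactly.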

Figure 12.

A case in which the optimal tree under the MDC criterion contains at least one cluster that does not occur in any of the input gene trees.

Three gene trees over the taxon-set {a, b, c, d, e}. The tree that minimizes the total number of extra lineages and that consists of only clusters induced by those three trees is the leftmost one. It requires seven extra lineages to reconcile all three gene trees.

Figure 13.

The compatibility graph that is built from clusters induced by the gene trees in Figure 12.

Each vertex of the graph corresponds to a cluster (the string shown next to it), and two vertices are adjacent if the clusters they represent are compatible. The number following ‘/’ in a vertex label is the total number of extra lineages contributed by the cluster corresponding to that vertex.
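
A compatibility graph of this kind can be sketched in a few lines. The example assumes the usual definition of compatibility for rooted clusters (two clusters are compatible when they are disjoint or one contains the other); the clusters used below are made up for illustration and are not the ones in the figure, and the function names are hypothetical.

```python
from itertools import combinations


def compatible(c1, c2):
    """Two clusters can coexist in one rooted tree if they are disjoint or nested."""
    return c1.isdisjoint(c2) or c1 <= c2 or c2 <= c1


def compatibility_graph(cluster_list):
    """Adjacency sets: one vertex per cluster, an edge per compatible pair."""
    graph = {c: set() for c in cluster_list}
    for c1, c2 in combinations(cluster_list, 2):
        if compatible(c1, c2):
            graph[c1].add(c2)
            graph[c2].add(c1)
    return graph


# Made-up clusters over the taxa a..e.
ab, bc, abc, de = (frozenset(s) for s in ("ab", "bc", "abc", "de"))
graph = compatibility_graph([ab, bc, abc, de])
```

Here `ab` and `bc` overlap without nesting, so they are not adjacent, while `de` is adjacent to every other vertex.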

Figure 14.

A tree that requires six extra lineages to reconcile the three gene trees in Figure 12.

Figure 15.

Fitting a gene tree T into a species tree T′.

Here, only mappings of internal nodes of T are shown.
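
One common way to fit a gene tree into a species tree is to map each internal gene-tree node to the most recent common ancestor, in the species tree, of the leaves below it. The sketch below assumes that this is the kind of mapping the figure shows; it represents nodes by their clusters, uses nested tuples for trees, and the names `clusters` and `fit` are hypothetical.

```python
def clusters(tree):
    """Leaf sets of all internal nodes of a nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    leaves(tree)
    return found


def fit(gene_tree, species_tree):
    """Map each internal gene-tree node to the smallest species-tree cluster containing it.

    Assumption: both trees are over the same taxon set, so the species-tree
    root always contains every gene-tree cluster.
    """
    species = clusters(species_tree)
    return {
        g: min((s for s in species if g <= s), key=len)
        for g in clusters(gene_tree)
    }


species_tree = ("A", ("B", ("C", "D")))
gene_tree = ("A", ("C", ("B", "D")))
mapping = fit(gene_tree, species_tree)
```

With these toy trees the gene-tree node above B and D maps to the species-tree node above B, C and D, since no smaller species-tree cluster contains both leaves.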

Figure 16.

Illustration of our method.

(A) A weighted compatibility graph is constructed from the clusters of the input gene trees (T1, T2, and T3). Shown at the bottom are all maximal cliques, along with their weights (the sum of the weights of their vertices), of which the three heaviest maximal cliques are highlighted. (B) A table showing the calculation of the weight of each vertex in the compatibility graph, where in each row v is the vertex that corresponds to the cluster C in that row, and w(v) = m + 1 − (α(C, T1) + α(C, T2) + α(C, T3)), with m = 2 in this case.
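
The steps in this caption can be sketched end to end on a toy input. Several points are assumptions rather than statements from the caption: α(C, T) is taken to be the number of maximal clades of T whose leaves all lie in C, minus one; m is taken to be the largest total of α over all clusters; maximal cliques are enumerated with a basic Bron-Kerbosch search; and the gene trees and all function names are made up for illustration.

```python
from itertools import combinations


def extra_lineages(cluster, tree):
    """Assumed alpha(C, T): maximal clades of T inside C, minus one."""
    def visit(node):
        # Returns (number of maximal clades inside the cluster, node fully inside?).
        if isinstance(node, str):
            inside = node in cluster
            return (1 if inside else 0), inside
        results = [visit(child) for child in node]
        if all(inside for _, inside in results):
            return 1, True                 # the whole clade replaces its children
        return sum(count for count, _ in results), False

    count, _ = visit(tree)
    return max(count - 1, 0)


def maximal_cliques(graph):
    """Basic Bron-Kerbosch enumeration (no pivoting)."""
    cliques = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(clique)
            return
        for v in list(candidates):
            expand(clique | {v}, candidates & graph[v], excluded & graph[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(frozenset(), set(graph), set())
    return cliques


# Toy gene trees (nested tuples) and the clusters they induce.
gene_trees = [(("a", "b"), "c"), (("a", "b"), "c"), ("a", ("b", "c"))]
ab, bc, abc = frozenset("ab"), frozenset("bc"), frozenset("abc")
vertices = [ab, bc, abc]

# Vertex weights w(v) = m + 1 - sum of alpha over the gene trees.
totals = {c: sum(extra_lineages(c, t) for t in gene_trees) for c in vertices}
m = max(totals.values())                   # assumed definition of m
weights = {c: m + 1 - totals[c] for c in vertices}

# Edges join compatible (disjoint or nested) clusters.
graph = {c: set() for c in vertices}
for c1, c2 in combinations(vertices, 2):
    if c1.isdisjoint(c2) or c1 <= c2 or c2 <= c1:
        graph[c1].add(c2)
        graph[c2].add(c1)

heaviest = max(maximal_cliques(graph), key=lambda q: sum(weights[c] for c in q))
```

Because a lower total of extra lineages gives a higher weight, the heaviest clique here is {ab, abc}, which corresponds to the tree ((a, b), c) supported by two of the three toy gene trees.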
