Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics

Figure 5

Bayesian integration is biased in a simplified Bayesian simulation.

For each replicate, a topology/branch-length combination was chosen from a discrete set of sixteen, each with equal probability. There are two possible topologies, (AB,CD) and (AC,BD); for each, there are four combinations of long (0.75 substitutions/site) and short (0.01 substitutions/site) terminal branches, and two internal branch lengths (0.1 or 0.001, not shown) for each combination of terminal lengths. For each replicate, an ideal dataset with the expected state pattern frequencies was generated given the topology and branch lengths. When these data are analyzed using BI, with the true uniform distribution over the true set of topology/branch-length combinations used as the prior, the topology noted next to each tree is inferred as the maximum a posteriori phylogeny with support >0.99. Bold text indicates incorrect inferences; regular text indicates correct inferences. The chart shows the proportion of inferences in which each topology is recovered by BI and by ML, along with the fraction of those inferences that are correct.
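The procedure described in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation, and it is not intended to reproduce the figure's results. The caption names neither the substitution model, the sequence length, nor the four terminal-length combinations, so the sketch assumes a Jukes-Cantor (JC69) model, a hypothetical length of 1000 sites, and four placeholder terminal-length sets. All function and variable names are invented for illustration.

```python
import itertools
import math

LONG, SHORT = 0.75, 0.01      # terminal branch lengths given in the caption
INTERNALS = (0.1, 0.001)      # internal branch lengths given in the caption
TOPOLOGIES = ("AB|CD", "AC|BD")

# ASSUMPTION: the caption does not list the four terminal-length
# combinations; these four are illustrative placeholders.
TERMINAL_SETS = [
    dict(A=LONG, B=SHORT, C=LONG, D=SHORT),
    dict(A=LONG, B=LONG, C=SHORT, D=SHORT),
    dict(A=SHORT, B=SHORT, C=SHORT, D=SHORT),
    dict(A=LONG, B=LONG, C=LONG, D=LONG),
]

# 2 topologies x 4 terminal sets x 2 internal lengths = 16 combinations.
HYPOTHESES = [
    (topo, terms, internal)
    for topo in TOPOLOGIES
    for terms in TERMINAL_SETS
    for internal in INTERNALS
]


def jc_prob(same, t):
    """JC69 probability of ending in the same/a given different state.

    ASSUMPTION: JC69 is used here only because the caption names no model.
    """
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e


def pattern_probs(topology, terminals, internal):
    """Expected site-pattern frequencies for an unrooted four-taxon tree."""
    left, right = topology.split("|")
    probs = {}
    for pattern in itertools.product(range(4), repeat=4):
        obs = dict(zip("ABCD", pattern))
        total = 0.0
        for x in range(4):        # state at the node joining `left`
            for y in range(4):    # state at the node joining `right`
                p = 0.25 * jc_prob(x == y, internal)
                for taxon in left:
                    p *= jc_prob(x == obs[taxon], terminals[taxon])
                for taxon in right:
                    p *= jc_prob(y == obs[taxon], terminals[taxon])
                total += p
        probs[pattern] = total
    return probs


def hypothesis_posterior(data, n_sites):
    """Posterior over the 16 combinations under a uniform prior."""
    log_like = []
    for hyp in HYPOTHESES:
        probs = pattern_probs(*hyp)
        log_like.append(
            n_sites * sum(f * math.log(probs[p]) for p, f in data.items())
        )
    top = max(log_like)           # subtract the maximum to avoid underflow
    weights = [math.exp(ll - top) for ll in log_like]
    total = sum(weights)
    return [w / total for w in weights]


def topology_posterior(post):
    """Sum posterior mass over branch-length combinations per topology."""
    out = {topo: 0.0 for topo in TOPOLOGIES}
    for (topo, _, _), p in zip(HYPOTHESES, post):
        out[topo] += p
    return out


# Ideal data: the exact expected pattern frequencies of one combination.
ideal_data = pattern_probs(*HYPOTHESES[0])
posterior = hypothesis_posterior(ideal_data, n_sites=1000)  # hypothetical length
print(topology_posterior(posterior))
```

Because the prior is uniform, it contributes the same constant to every combination and drops out on normalization, so the posterior depends only on the likelihoods. Topology support is obtained by summing over the branch-length combinations that share a topology, which is the integration step the caption refers to. Changing the index into `HYPOTHESES` or the value of `n_sites` shows how support shifts under these assumed settings.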

