Fig 1.
Illustrating the “growth” of lineages of a gene tree in a phylogenetic network.
The histories of green and red alleles are shown as solid (green) lines and dashed (red) lines, respectively.
Fig 2.
An illustration of the decompose-and-split operation.
In this example, the partial likelihood is decomposed into six vectors, F0 to F5. The figure shows how F4 is split in the four possible ways to trace branches y and z; every split is assigned a unique label.
Fig 3.
The two model phylogenetic networks used to generate the simulated data sets.
The branch lengths of the phylogenetic networks are measured in units of the expected number of mutations per site (scale shown). The inheritance probabilities are marked in blue. Both networks are based on the same “backbone” tree: removing the R→Q reticulation edge in (A), or the C→A and R→Q reticulation edges in (B), gives rise to the tree (C,(R,(L,(A,Q)))). The hybridization events in the two panels can be viewed as involving pairs of branches of this tree: (A) The hybridization is from R to Q. (B) One hybridization is from R to Q and another is from C to A.
Fig 4.
The ratio of trees (blue), 1-reticulation networks (green), 2-reticulation networks (black), and 3-reticulation networks (purple) sampled under different simulation settings.
Top row: The true network is the 1-reticulation network in Fig 3(A). Bottom row: The true network is the 2-reticulation network in Fig 3(B). Left column: The correct prior hyperparameters for the population mutation rate were used. Right column: The incorrect prior hyperparameters for the population mutation rate were used.
Fig 5.
The topological distance (pink) between sampled networks and true network, and the Robinson-Foulds distance (orange) between sampled trees and true backbone tree, under different simulation settings.
Top row: The true network is the 1-reticulation network in Fig 3(A). Bottom row: The true network is the 2-reticulation network in Fig 3(B). Left column: The correct prior hyperparameters for the population mutation rate were used. Right column: The incorrect prior hyperparameters for the population mutation rate were used. The samples underlying the orange points correspond to those underlying the blue bars in Fig 4, whereas the samples underlying the pink points correspond to all other samples in Fig 4.
Fig 6.
Histograms of the branch lengths sampled by our method on the simulated data set corresponding to the phylogenetic network of Fig 3(A).
Blue: 1,000 sites. Green: 10,000 sites. Black: 100,000 sites. Purple: 1,000,000 sites. The red dashed lines correspond to the true values.
Fig 7.
Histograms of the population mutation rates sampled by our method for each of the branches on the simulated data set corresponding to the phylogenetic network of Fig 3(A).
Blue: 1,000 sites. Green: 10,000 sites. Black: 100,000 sites. Purple: 1,000,000 sites. The red dashed lines correspond to the true values.
Fig 8.
A histogram of the inheritance probabilities sampled by our method on the simulated data set corresponding to the phylogenetic network of Fig 3(A).
Blue: 1,000 sites. Green: 10,000 sites. Black: 100,000 sites. Purple: 1,000,000 sites. The red dashed line corresponds to the true value.
Fig 9.
Histograms of the branch lengths sampled by our method on the simulated data set corresponding to the phylogenetic network of Fig 3(A).
The five curves correspond to five independent runs. The red dashed lines correspond to the true values.
Fig 10.
Histograms of the population mutation rates sampled by our method for each of the branches on the simulated data set corresponding to the phylogenetic network of Fig 3(A).
The five curves correspond to five independent runs. The red dashed lines correspond to the true values.
Fig 11.
A histogram of the inheritance probabilities sampled by our method on the simulated data set corresponding to the phylogenetic network of Fig 3(A).
The five curves correspond to five independent runs. The red dashed line corresponds to the true value.
Fig 12.
The phylogenetic network used to investigate the effect of sampling multiple individuals.
The branch lengths of the phylogenetic network are measured in units of the expected number of mutations per site. The inheritance probabilities are marked in blue.
Fig 13.
Histograms of the branch lengths sampled by our method on the simulated data set corresponding to the phylogenetic network of Fig 12.
In all cases, a single diploid individual was sampled from A, B, and D. Blue: A single diploid individual is sampled from C. Green: Four diploid individuals are sampled from C. The red dashed lines correspond to the true values.
Fig 14.
Histograms of the population mutation rates sampled by our method for each of the branches on the simulated data set corresponding to the phylogenetic network of Fig 12.
In all cases, a single diploid individual was sampled from A, B, and D. Blue: A single diploid individual is sampled from C. Green: Four diploid individuals are sampled from C. The red dashed lines correspond to the true values.
Fig 15.
A histogram of the inheritance probabilities sampled by our method on the simulated data set corresponding to the phylogenetic network of Fig 12.
In all cases, a single diploid individual was sampled from A, B, and D. Blue: A single diploid individual is sampled from C. Green: Four diploid individuals are sampled from C. The red dashed line corresponds to the true value.
Fig 16.
The height of trees and networks sampled under different simulation settings and violations of the different assumptions.
The red dashed lines correspond to the true values. In each panel at most one condition is violated. (a) A mean of 1.0 is used for the Poisson prior on the number of reticulations. (b) A mean of 3.0 is used for the Poisson prior on the number of reticulations. (c) Linked loci: 10 sites are generated per gene tree. (d) Linked loci: 100 sites are generated per gene tree. (e) Rate variation across lineages, with a proportion of invariable sites of 0.1 and a shape parameter of 3.0 for gamma-distributed rate heterogeneity. (f) Rate variation across lineages, with a proportion of invariable sites of 0.2 and a shape parameter of 5.0 for gamma-distributed rate heterogeneity. (g) Rate variation across markers, with a proportion of invariable sites of 0.1 and a shape parameter of 3.0 for gamma-distributed rate heterogeneity. (h) Rate variation across markers, with a proportion of invariable sites of 0.2 and a shape parameter of 5.0 for gamma-distributed rate heterogeneity.
Fig 17.
The ratio of trees (blue) and 1-reticulation networks (green) sampled under different simulation settings and violations of the different assumptions.
The true number of reticulations is 1. In each panel at most one condition is violated. (a) A mean of 1.0 is used for the Poisson prior on the number of reticulations. (b) A mean of 3.0 is used for the Poisson prior on the number of reticulations. (c) Linked loci: 10 sites are generated per gene tree. (d) Linked loci: 100 sites are generated per gene tree. (e) Rate variation across lineages, with a proportion of invariable sites of 0.1 and a shape parameter of 3.0 for gamma-distributed rate heterogeneity. (f) Rate variation across lineages, with a proportion of invariable sites of 0.2 and a shape parameter of 5.0 for gamma-distributed rate heterogeneity. (g) Rate variation across markers, with a proportion of invariable sites of 0.1 and a shape parameter of 3.0 for gamma-distributed rate heterogeneity. (h) Rate variation across markers, with a proportion of invariable sites of 0.2 and a shape parameter of 5.0 for gamma-distributed rate heterogeneity.
Fig 18.
The topological distance (pink) between sampled networks and the true network, and the Robinson-Foulds distance (orange) between sampled trees and the true backbone tree, under different simulation settings and violations of the different assumptions.
In each panel at most one condition is violated. (a) A mean of 1.0 is used for the Poisson prior on the number of reticulations. (b) A mean of 3.0 is used for the Poisson prior on the number of reticulations. (c) Linked loci: 10 sites are generated per gene tree. (d) Linked loci: 100 sites are generated per gene tree. (e) Rate variation across lineages, with a proportion of invariable sites of 0.1 and a shape parameter of 3.0 for gamma-distributed rate heterogeneity. (f) Rate variation across lineages, with a proportion of invariable sites of 0.2 and a shape parameter of 5.0 for gamma-distributed rate heterogeneity. (g) Rate variation across markers, with a proportion of invariable sites of 0.1 and a shape parameter of 3.0 for gamma-distributed rate heterogeneity. (h) Rate variation across markers, with a proportion of invariable sites of 0.2 and a shape parameter of 5.0 for gamma-distributed rate heterogeneity.
Fig 19.
The MAP phylogenetic network for the subset with the hybrid O. × cockayneana (Meudt 175a, MPN 29710) and putative parents.
The width of each tube is proportional to the population mutation rate of each branch, which is printed on each tube. The length of each tube is proportional to the length of the corresponding branch in units of expected number of mutations per site (scale shown). Blue arrows indicate the reticulation edges and their inheritance probabilities are printed in blue.
Fig 20.
The MAP phylogenetic network for the subset with the hybrid O. × prorepens (Meudt 203a, MPN 29774) and putative parents.
The width of each tube is proportional to the population mutation rate of each branch, which is printed on each tube. The length of each tube is proportional to the length of the corresponding branch in units of expected number of mutations per site (scale shown). Blue arrows indicate the reticulation edges and their inheritance probabilities are printed in blue.