PLOS Computational Biology

Fig 1.

Branch rate parameterisations.
Top left: the prior density of a branch rate r under a Log-normal(−0.5σ², σ) distribution (with its mean fixed at 1). The function for transforming into branch rates is depicted for real (top right), cat (bottom left), and quant (bottom right). For visualisation purposes, only 10 bins/pieces are displayed; in practice we use 2N − 2 bins for cat and 100 pieces for quant. The first and final quant pieces are equal to the underlying function (solid lines), whereas the pieces in between use linear approximations of this function (dashed lines).
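
The cat and quant transformations described in this caption can be sketched in a few lines of Python. This is an illustration only, not the BEAST 2 implementation: the function names are hypothetical, placing each cat rate at the mid quantile of its bin is an assumption, and so is the equal spacing of the quant pieces.

```python
import math
from statistics import NormalDist

def lognormal_icdf(q, sigma):
    """Inverse CDF of Log-normal(-0.5*sigma^2, sigma), whose mean is 1."""
    mu = -0.5 * sigma ** 2
    return math.exp(mu + sigma * NormalDist().inv_cdf(q))

def cat_rate(k, n_bins, sigma):
    """cat: integer category k in [0, n_bins) -> rate.

    Assumption: the rate sits at the mid quantile of bin k.
    """
    return lognormal_icdf((k + 0.5) / n_bins, sigma)

def quant_rate(q, sigma, n_pieces=100):
    """quant: quantile q in (0, 1) -> rate via a piecewise linear approximation.

    Assumption: pieces are equally spaced in quantile space.
    """
    piece = min(int(q * n_pieces), n_pieces - 1)
    if piece == 0 or piece == n_pieces - 1:
        # The first and final pieces use the underlying function directly.
        return lognormal_icdf(q, sigma)
    lo, hi = piece / n_pieces, (piece + 1) / n_pieces
    r_lo, r_hi = lognormal_icdf(lo, sigma), lognormal_icdf(hi, sigma)
    # Interior pieces interpolate linearly between the piece boundaries.
    return r_lo + (r_hi - r_lo) * (q - lo) / (hi - lo)
```

In the real parameterisation no transformation is needed, because the rate itself is the estimated parameter.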

Table 1.

Summary of pre-existing BEAST 2 operators.

Fig 2.

Clock standard deviation scale operators.
The two operators above propose a clock standard deviation σ → σ′. Then, either the new quantiles are such that the rates remain constant (“New quantiles”, above) or the new rates are such that the quantiles remain constant (“New rates”). In the real parameterisation, these two operators are known as Scale and CisScale, respectively, whereas in quant they are known as CisScale and Scale.
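
A minimal sketch of the two alternatives, assuming the Log-normal(−0.5σ², σ) prior from Fig 1. The function names are hypothetical, and the proposal of σ′ itself and the Hastings ratio are omitted, so this shows only how rates and quantiles are recomputed once σ′ has been proposed.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def rate_to_quantile(rate, sigma):
    """CDF of Log-normal(-0.5*sigma^2, sigma) evaluated at rate."""
    mu = -0.5 * sigma ** 2
    return STD_NORMAL.cdf((math.log(rate) - mu) / sigma)

def quantile_to_rate(q, sigma):
    """Inverse CDF of Log-normal(-0.5*sigma^2, sigma) evaluated at q."""
    mu = -0.5 * sigma ** 2
    return math.exp(mu + sigma * STD_NORMAL.inv_cdf(q))

def new_quantiles(rates, sigma_new):
    """'New quantiles': keep the rates constant, recompute quantiles under sigma_new."""
    return [rate_to_quantile(r, sigma_new) for r in rates]

def new_rates(quantiles, sigma_new):
    """'New rates': keep the quantiles constant, recompute rates under sigma_new."""
    return [quantile_to_rate(q, sigma_new) for q in quantiles]
```

For example, `new_quantiles([0.5, 1.0, 2.0], 0.8)` returns the quantiles that reproduce those three rates under σ′ = 0.8.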

Fig 3.

Traversing likelihood space.
The z-axes above are the log-likelihoods of the genetic distance r × τ between two simulated nucleic acid sequences of length L, under the Jukes-Cantor substitution model [40]. Two possible proposals from the current state (white circle) are depicted. These proposals are generated by the RandomWalk (RW) and ConstantDistance (CD) operators. In the low signal dataset (L = 0.1 kb), both operators can traverse the likelihood space effectively. However, the same proposal by RandomWalk incurs a much larger likelihood penalty in the L = 0.5 kb dataset by “falling off the ridge”, in contrast to ConstantDistance, which “walks along the ridge”. This discrepancy is even stronger for larger datasets, which necessitates operators such as ConstantDistance that account for correlations between branch lengths and rates.
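
The ridge can be reproduced with a small Python sketch of the pairwise Jukes-Cantor log-likelihood. The current state (r = 1, τ = 0.1), the proposed values, and the 10% proportion of differing sites are illustrative assumptions, not the values used for the figure.

```python
import math

def jc69_pair_loglik(rate, time, n_sites, n_diff):
    """Log-likelihood of two sequences separated by genetic distance rate * time."""
    distance = rate * time
    decay = math.exp(-4.0 * distance / 3.0)
    p_same = 0.25 * (0.25 + 0.75 * decay)   # site pattern with identical nucleotides
    p_diff = 0.25 * (0.25 - 0.25 * decay)   # site pattern with two given different nucleotides
    return (n_sites - n_diff) * math.log(p_same) + n_diff * math.log(p_diff)

def proposal_penalties(n_sites, diff_fraction=0.1):
    """Log-likelihood lost by a RandomWalk-style and a ConstantDistance-style proposal."""
    n_diff = round(n_sites * diff_fraction)
    current = jc69_pair_loglik(1.0, 0.1, n_sites, n_diff)
    # RandomWalk-style: change the rate only, so the distance r * tau changes.
    random_walk = jc69_pair_loglik(2.0, 0.1, n_sites, n_diff)
    # ConstantDistance-style: change rate and time together so r * tau is unchanged.
    constant_distance = jc69_pair_loglik(2.0, 0.05, n_sites, n_diff)
    return current - random_walk, current - constant_distance
```

Calling `proposal_penalties(100)` and `proposal_penalties(500)` shows the RandomWalk-style penalty growing in proportion to L while the ConstantDistance-style penalty stays at zero, because the likelihood depends on r and τ only through their product.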

Table 2.

Summary of AdaptiveOperatorSampler operators and their parameters of interest (POI).

Fig 4.

The Bactrian proposal kernel.
The step size made under a Bactrian proposal kernel is equal to sΣ where Σ is drawn from the above distribution and s is tunable.
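
As a rough illustration, a Bactrian draw can be written as an equal mixture of two normal distributions centred at ±m with variance 1 − m², which gives Σ a mean of 0 and a variance of 1. The value m = 0.95 and the function names are assumptions made for this sketch; the caption does not state them.

```python
import math
import random

def bactrian_draw(rng, m=0.95):
    """Draw Sigma from a two-humped Bactrian distribution (mean 0, variance 1)."""
    centre = m if rng.random() < 0.5 else -m
    return rng.gauss(centre, math.sqrt(1.0 - m * m))

def bactrian_random_walk(x, s, rng, m=0.95):
    """Propose x' = x + s * Sigma, where s is the tunable step size."""
    return x + s * bactrian_draw(rng, m)
```

Because little probability mass sits near zero, such a kernel rarely proposes a step that barely moves the current state.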

Fig 5.

Depiction of NarrowExchange and NarrowExchangeRate operators.
Proposals are denoted by . The vertical axes correspond to node heights t. In the bottom figure, branch rates r are indicated by line width and therefore genetic distances are equal to the width of each branch multiplied by its length. In this example, the and constraints are satisfied.

Fig 6.

Screening of NER and NERw variants by acceptance rate.
Top left: comparison of NER variants with the null operator NER{} = NarrowExchange. Each operator is represented by a single point, uniquely encoded by the point stylings. The number of times each operator is proposed and accepted is compared with that of NER{}, and one-sided z-tests are performed to assess the statistical significance between the two acceptance rates (p = 0.001). This process is repeated across 300 simulated datasets. The axes of each plot are the proportion of these 300 simulations for which there is evidence that the operator is significantly better than NER{} (x-axis) or worse than NER{} (y-axis). Top right: comparison of NER and NERw acceptance rates. Each point is one NER/NERw variant from a single simulation. Bottom: relationship between the acceptance rates α of and NER{} with the clock model standard deviation σ and the number of sites L. Each point is a single simulation.
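
One plausible form of the one-sided z-test is a pooled two-proportion test on the accepted and proposed counts, sketched below. The pooled standard error and the function name are assumptions; the caption does not specify the exact test statistic.

```python
import math
from statistics import NormalDist

def one_sided_z_test(acc_a, prop_a, acc_b, prop_b):
    """P-value for the alternative that operator A has a higher acceptance rate than B.

    acc_* are accepted counts and prop_* are proposed counts.
    Assumes the pooled acceptance rate is strictly between 0 and 1.
    """
    rate_a, rate_b = acc_a / prop_a, acc_b / prop_b
    pooled = (acc_a + acc_b) / (prop_a + prop_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / prop_a + 1.0 / prop_b))
    z = (rate_a - rate_b) / se
    return 1.0 - NormalDist().cdf(z)
```

Under this sketch, an operator would be counted as significantly better than NER{} in a given simulation when the returned p-value falls below 0.001, and as significantly worse when the test with the arguments swapped does.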

Table 3.

Summary of clock model operators introduced throughout this article.

Table 4.

Operator configurations and the substitution rate parameterisations to which each operator is applicable.

Fig 7.

Protocol for optimising clock model methodologies.
Each area (detailed in Models and methods) is optimised sequentially, and the best setting from each step is used when optimising the following step.

Table 5.

Benchmark datasets, sorted in increasing order of taxon count N.

Fig 8.

Round 1: Benchmarking the AdaptiveOperatorSampler operator.
Top left, top right, bottom left: each plot compares the ESS/hr (±1 standard error) across two operator configurations. Bottom right: the effect of sequence length L on operator weights learned by AdaptiveOperatorSampler. Both sets of observations are fit by logistic regression models. The benchmark datasets are displayed in Table 5. The cat and quant settings are evaluated in S1 Fig.

Fig 9.

Round 2: Benchmarking substitution rate parameterisations.
Top left, top right, bottom left: the adapt (real), adapt (cat), and adapt (quant) configurations were compared. Bottom right: comparison of the mean tip substitution rate ESS/hr as a function of alignment length L.

Fig 10.

Comparison of runtimes across methodologies.
The computational time required for a setting to sample a single state is divided by that of the nocons (cat) configuration. The geometric mean across all 9 datasets under each configuration is displayed as a horizontal bar.
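
The summary statistic can be sketched as follows. The function names and any input values are hypothetical; only the ratio-then-geometric-mean structure is taken from the caption.

```python
from statistics import geometric_mean

def relative_runtimes(times, baseline_times):
    """Per-dataset time per sampled state, divided by the baseline configuration's."""
    return [t / b for t, b in zip(times, baseline_times)]

def summarise(times, baseline_times):
    """Geometric mean of the per-dataset runtime ratios."""
    return geometric_mean(relative_runtimes(times, baseline_times))
```

A geometric mean suits ratios because a halving on one dataset and a doubling on another cancel out to 1, which an arithmetic mean would not reflect.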

Fig 11.

Round 3: Benchmarking the Bactrian kernel.
The ESS/hr (±1 s.e.) under the Bactrian configuration, divided by that under the uniform kernel, is shown in the y-axis for each dataset and relevant parameter. Horizontal bars show the geometric mean under each parameter.

Fig 12.

Round 4: Benchmarking the NER operators.
Top: the learned weights (left) behind the two NER operators (NER{} and ), and the relative difference between their acceptance rates α (right), are presented as functions of sequence length. Logistic and logarithmic regression models are shown, respectively. Bottom: maximum clade credibility tree of the bony fish dataset by Broughton et al. 2013 [53]. This alignment received the strongest boost from NER, likely due to its high topological uncertainty and branch rate variance. Branches are coloured by substitution rate, the y-axis shows time units, and internal nodes are labelled with posterior clade support. Tree visualised using UglyTrees [59].

Fig 13.

Round 5: Benchmarking the LeafAVMVN operator.
See Fig 11 caption for figure notation.
