FoldBeast enables structural phylogenetic inference using the 3Di structural alphabet (see review: Puente-Lelievre et al. 2025). Currently, three matrices are available: the original Foldseek 3Di matrix (van Kempen et al. 2023), and the GH AlphaFold and LLM matrices (Garg and Hochberg 2025). This package enables inference under each of the three matrices, as well as model averaging.
If you find this package helpful for your research, please cite our preprint where we apply this method to aminoacyl-tRNA synthetases (Douglas & Bromham 2025).
FoldBeast is currently in pre-release.
- Launch BEAUti
- Click on `File -> Manage Packages`
- Install FoldBeast. If FoldBeast is not in the list of packages, you may need to add an extra package repository as follows:
  - Click the package repositories button. A dialog pops up.
  - Click the Add URL button. A dialog is shown where you can enter https://raw.githubusercontent.com/CompEvol/CBAN/master/packages-extra-2.7.xml
  - Click the OK button. There should be an extra entry in the list.
  - Click Done.
  - After a short delay, FoldBeast should appear in the list of packages.
This package requires BEAST 2.7 or newer. To follow this tutorial, the following BEAST 2 packages should be installed. This can be done by opening BEAUti and then File -> Manage Packages.
- ORC -- the optimised relaxed clock, which we will use as a molecular clock model.
- OBAMA -- an amino acid model averaging framework, for estimating the amino acid substitution model.
In this tutorial, we will configure a BEAST 2 analysis from an amino acid partition and a 3Di partition of the same dataset. Both partitions will share a phylogeny, but each will have its own site and clock models.
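Because the two partitions share one tree, both alignments need to contain the same taxa. The following is a minimal sketch of a pre-flight check, not part of FoldBeast: the helper names `read_fasta` and `compare_partitions` and the toy sequences are made up for illustration. It also compares ungapped sequence lengths, on the assumption that each 3Di string has one character per residue of the matching amino acid sequence; a mismatch is not necessarily an error if the two alignments were trimmed differently.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {taxon name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            if name in records:
                raise ValueError(f"duplicate taxon name: {name}")
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {taxon: "".join(parts) for taxon, parts in records.items()}


def compare_partitions(aa, tdi):
    """Return (taxa only in aa, taxa only in tdi, shared taxa whose ungapped lengths differ)."""
    only_aa = sorted(set(aa) - set(tdi))
    only_3di = sorted(set(tdi) - set(aa))
    length_mismatch = sorted(
        taxon
        for taxon in set(aa) & set(tdi)
        if len(aa[taxon].replace("-", "")) != len(tdi[taxon].replace("-", ""))
    )
    return only_aa, only_3di, length_mismatch


# Toy alignments (hypothetical data); in practice, read the text of the two
# alignment files from the examples/ folder instead.
aa_text = ">taxonA\nMK-LV\n>taxonB\nMKALV\n"
tdi_text = ">taxonA\nDP-VV\n>taxonB\nDPDVV\n>taxonC\nDPDVV\n"

only_aa, only_3di, mismatch = compare_partitions(read_fasta(aa_text), read_fasta(tdi_text))
print("Only in amino acid alignment:", only_aa)
print("Only in 3Di alignment:", only_3di)
print("Ungapped length differs:", mismatch)
```

In this toy case, `taxonC` appears only in the 3Di alignment, so it would need to be added to the amino acid alignment or removed before the partitions are linked to one tree.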
- Launch BEAUti.
- Load the two `fasta` alignment files in the `examples/` folder.
- Select amino acid as the data type for `crimvlg_aa` and select 3Di as the data type for `crimvlg_3di`. This alignment is an anticodon binding domain from eight aminoacyl-tRNA synthetase families.
- Link the two partitions into the same tree, but let them have their own clock and site models.
- Open the `Site Model` tab.
  - To estimate the amino acid site and substitution model, select the `OBAMA Bayesian Model Averaging` model for the `crimvlg_aa` partition.
  - To estimate the 3Di site and substitution model, select the `Fold Beast 3Di Model Averaging` model for the `crimvlg_3di` partition. This will compare the 3Di substitution matrices described at the top of this page, plus a "null model" where all exchangeability rates are equal. If the null model is favoured, something may be wrong with the analysis; for example, amino acids may have been loaded instead of 3Di characters.
  - To estimate the relative rate of the two partitions, tick the `Estimate` box next to `Mutation Rate` on either partition.
- Open the `Clock Model` tab and select the `Optimised Relaxed Clock` for either partition.
- Open the `Priors` tab and make any desired adjustments to the tree prior and other priors, as per usual.
- Optional: ancestral sequence reconstruction, and estimation of the number of amino acid and 3Di substitutions per lineage, can be configured with the BeastMap package.
- Save the XML file and run BEAST 2, as per usual.
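To make the "null model" from the site model step more concrete, here is a rough sketch of what equal exchangeability rates imply for a 20-state alphabet such as 3Di. This is not FoldBeast's implementation: the function name `rate_matrix`, the standard time-reversible parameterisation (rate from state i to state j equals exchangeability times the frequency of j), and the uniform state frequencies are all assumptions chosen for illustration.

```python
def rate_matrix(exchangeabilities, frequencies):
    """Build a rate matrix with Q[i][j] = r[i][j] * pi[j], scaled to one expected substitution per unit time."""
    n = len(frequencies)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = exchangeabilities[i][j] * frequencies[j]
        # The diagonal is still 0.0 here, so this is minus the off-diagonal row sum.
        q[i][i] = -sum(q[i])
    # Expected substitution rate at stationarity, used to normalise the matrix.
    scale = -sum(frequencies[i] * q[i][i] for i in range(n))
    return [[value / scale for value in row] for row in q]


n_states = 20  # size of the 3Di alphabet
equal_rates = [[1.0] * n_states for _ in range(n_states)]  # null model: all exchangeabilities equal
uniform = [1.0 / n_states] * n_states  # assumed uniform frequencies, for illustration only

null_q = rate_matrix(equal_rates, uniform)
print("off-diagonal rate:", null_q[0][1])
print("diagonal rate:", null_q[0][0])
```

Under these assumptions every change between two different states is equally likely, so the matrix carries no information about which 3Di states tend to replace one another. That is why strong support for the null model suggests the data may not be 3Di characters.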
Douglas, J., & Bromham, L. (2025). Reconstructing substitution histories on phylogenies, with accuracy, precision, and coverage. bioRxiv, 2025-12. https://doi.org/10.64898/2025.12.21.695861
Puente-Lelievre, C., Malik, A., & Douglas, J. (2025). Protein Structural Phylogenetics. Genome Biology and Evolution, 17(8), evaf139.
van Kempen, M., et al. (2023). Fast and accurate protein structure search with Foldseek. Nature Biotechnology, 1-4.
Garg, S. G., & Hochberg, G. K. (2025). A general substitution matrix for structural phylogenetics. Molecular Biology and Evolution, 42(6), msaf124.
BEAST user forum: https://groups.google.com/g/beast-users
Or email Jordan Douglas: jordan.douglas@auckland.ac.nz

