Description
I ran an experiment to evaluate the variance in phylogenetic trees reconstructed from a single gene. Two input files containing identical MSAs but with the sequences in a different order produced different trees, as assessed by tree distance metrics and visual inspection. Setting a constant seed (via --seed) did not help: the two files still resulted in distinct trees. Moreover, repeated runs on the same input file (and therefore the same sequence order) with the same seed also produced different trees.
The same substitution model was used across all tests:
iqtree2 -T 8 -s test_msa.afa -m WAG+G4 --prefix test_tree --nstop 50 --seed 8787
Using a single thread (-nt 1) resulted in identical trees when the input was identical. However, shuffling the order of the sequences in the input FASTA led to distinct trees.
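For anyone who wants to reproduce the shuffled-order input, here is a minimal Python sketch of how such a file could be generated and checked. It is not the script used for the experiment above; the function names (`read_fasta`, `shuffle_records`, `write_fasta`, `same_alignment`) are hypothetical, and it assumes a plain FASTA file where each record starts with a `>` header line.

```python
import random


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def shuffle_records(records, seed):
    """Return a reordered copy of the records; the input list is untouched."""
    rng = random.Random(seed)  # local RNG so the shuffle itself is repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    return shuffled


def write_fasta(records):
    """Serialize records back to FASTA text, one sequence line per record."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)


def same_alignment(a, b):
    """True if both record lists hold the same sequences, ignoring order."""
    return sorted(a) == sorted(b)


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    original = read_fasta(">a\nAC-G\n>b\nACTG\n>c\nA-TG\n")
    shuffled = shuffle_records(original, seed=8787)
    print(write_fasta(shuffled), end="")
    print(same_alignment(original, shuffled))
```

The `same_alignment` check confirms that the two inputs differ only in row order, so any difference between the resulting trees cannot be attributed to the alignment content itself.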
I could not find any documentation indicating that using multiple threads would make results irreproducible or that changing the sequence order in the input file would lead to different results, regardless of the seed and number of threads. Am I missing something?
Version:
IQ-TREE multicore version 2.3.4 COVID-edition for Linux x86 64-bit built Apr 26 2024