Codon frequency estimation

Kenji Fukushima edited this page Feb 21, 2026 · 4 revisions

Overview

CSUBST requires codon equilibrium frequencies for codon-model analyses. It uses the codon frequencies reported by IQ-TREE when they are available. Some IQ-TREE 3 codon outputs omit these values; in that case, CSUBST falls back to empirical estimation from the alignment.

Frequency source priority

  1. Parse codon pi(...) values from the IQ-TREE .iqtree report.
  2. If codon pi(...) values are unavailable in IQ-TREE 3 output, estimate frequencies from the input codon alignment.
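
As an illustration of step 1, here is a minimal sketch of extracting codon pi(...) values from report text. It assumes each entry looks like `pi(AAA) = 0.0123`; the actual .iqtree layout may differ, and `PI_PATTERN` and `parse_codon_pi` are hypothetical names, not part of CSUBST. An empty result corresponds to the situation in which step 2 applies.

```python
import re

# Assumed entry format: "pi(AAA) = 0.0123". This is an illustration only;
# the real .iqtree report layout may differ.
PI_PATTERN = re.compile(r"pi\(([ACGT]{3})\)\s*=\s*([0-9.eE+-]+)")


def parse_codon_pi(report_text):
    """Return {codon: frequency} for every codon pi(...) entry in the text.

    Three-letter states are required, so nucleotide entries such as
    "pi(A) = 0.25" are ignored. An empty dict means no codon entries were found.
    """
    return {codon: float(value) for codon, value in PI_PATTERN.findall(report_text)}


example_report = """
  pi(AAA) = 0.0200  pi(AAC) = 0.0150
  pi(A) = 0.2500
"""
codon_pi = parse_codon_pi(example_report)
no_pi = parse_codon_pi("no frequency entries here")
```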

Alignment-based empirical estimation

When fallback estimation is used, CSUBST:

  • reads all codons from the input in-frame alignment,
  • converts U to T,
  • skips missing or undefined codons,
  • expands IUPAC ambiguous symbols into compatible codons,
  • splits the count of each ambiguous codon equally across its compatible codons, and
  • normalizes counts so the total frequency is 1.
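
The steps above can be sketched as follows. This is not CSUBST's implementation: the function name `estimate_codon_freqs`, the `model_codons` parameter (the codon states of the model), and the exact rules for what counts as a missing codon are assumptions made for illustration. Here, a codon is skipped if it contains any character outside the IUPAC nucleotide codes (for example `-` or `?`) or if none of its expansions is a model codon.

```python
from collections import Counter
from itertools import product

# Standard IUPAC nucleotide codes and the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def estimate_codon_freqs(seqs, model_codons):
    """Estimate empirical codon frequencies from in-frame aligned sequences."""
    allowed = set(model_codons)
    counts = Counter({codon: 0.0 for codon in model_codons})
    for seq in seqs:
        seq = seq.upper().replace("U", "T")  # treat RNA input as DNA
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if any(base not in IUPAC for base in codon):
                continue  # gap or other missing-data character (assumed rule)
            # Expand ambiguity codes into every compatible model codon.
            expanded = ("".join(bases) for bases in product(*(IUPAC[b] for b in codon)))
            compatible = [c for c in expanded if c in allowed]
            if not compatible:
                continue  # nothing to assign the count to (assumed rule)
            for c in compatible:
                counts[c] += 1.0 / len(compatible)  # equal split
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no usable codons in alignment")
    return {codon: counts[codon] / total for codon in model_codons}


# Toy example with a three-codon state space.
freqs = estimate_codon_freqs(["ATGAAR", "AUG---"], ["ATG", "AAA", "AAG"])
```

In the toy example, `ATG` is counted twice (once after converting `AUG`), `AAR` contributes half a count each to `AAA` and `AAG`, and `---` is skipped, so the three frequencies are 2/3, 1/6, and 1/6.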

Notes

  • This fallback keeps codon-model workflows robust when IQ-TREE 3 output does not include codon pi(...) entries.
  • If codon frequency parsing fails in IQ-TREE 2 output, CSUBST raises an error instead of using this fallback.
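
The version-dependent behavior described in these notes can be summarized in a short sketch. How CSUBST detects the IQ-TREE version is not covered here, so the version is passed in as a plain integer; `select_frequency_source` is a hypothetical name, and treating every version below 3 like IQ-TREE 2 is an assumption.

```python
def select_frequency_source(parsed_freqs, iqtree_major_version):
    """Return which source to use for codon frequencies.

    parsed_freqs: dict of codon frequencies parsed from the report (may be empty).
    iqtree_major_version: major version of the IQ-TREE run (assumed to be known).
    """
    if parsed_freqs:
        return "iqtree"  # reported pi(...) values always take priority
    if iqtree_major_version >= 3:
        return "alignment"  # fallback: empirical estimation from the alignment
    # Older output is expected to contain the values, so fail loudly.
    raise ValueError("codon frequencies could not be parsed from IQ-TREE output")


source_reported = select_frequency_source({"AAA": 0.02}, 2)
source_fallback = select_frequency_source({}, 3)
try:
    select_frequency_source({}, 2)
    raised = False
except ValueError:
    raised = True
```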
