
@recmo commented Sep 12, 2025

As suggested by @oflatt in egraphs-good/egg#358 (comment), I'm contributing a beam-search extractor to the gym.

It's basically faster-greedy-dag, except that it maintains the top K candidate solutions per class. For K=1, the solutions are more or less the same as those of faster-greedy-dag; for larger K they get progressively better. Beam search suits egraph extraction because the greedy solution is close to optimal, and beam search is good at exploring the area around the greedy solution.
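To make the "top K candidates per class" idea concrete, here is a minimal sketch of a bottom-up beam search with DAG cost. It is not the code in this PR: `Node`, `Candidate`, `merge`, `prune`, `beam_extract`, and `demo_classes` are hypothetical names, and the sketch assumes an acyclic egraph whose classes are already topologically ordered (children have smaller ids than their parents). Conflicting node choices are resolved by arbitrarily keeping the first side.

```rust
use std::collections::BTreeMap;

/// Hypothetical e-node: its own cost plus the e-classes of its children.
struct Node {
    cost: u64,
    children: Vec<usize>,
}

/// A candidate solution: one chosen node index per selected class.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Candidate {
    choices: BTreeMap<usize, usize>, // class id -> index of the chosen node
}

impl Candidate {
    /// DAG cost: every chosen node is counted once, however often it is shared.
    fn cost(&self, classes: &[Vec<Node>]) -> u64 {
        self.choices.iter().map(|(&c, &n)| classes[c][n].cost).sum()
    }
}

/// Merge `other` into `base`; on a conflicting choice keep the one in `base`.
/// Choices that become unreachable after a conflict are not cleaned up here.
fn merge(base: &Candidate, other: &Candidate) -> Candidate {
    let mut out = base.clone();
    for (&c, &n) in &other.choices {
        out.choices.entry(c).or_insert(n);
    }
    out
}

/// Keep only the `k` cheapest distinct candidates.
fn prune(cands: &mut Vec<Candidate>, classes: &[Vec<Node>], k: usize) {
    cands.sort_by_cached_key(|c| (c.cost(classes), c.choices.clone()));
    cands.dedup();
    cands.truncate(k);
}

/// Returns, for every class, its beam of at most `k` candidates (cheapest first).
fn beam_extract(classes: &[Vec<Node>], k: usize) -> Vec<Vec<Candidate>> {
    let mut beams: Vec<Vec<Candidate>> = Vec::with_capacity(classes.len());
    for (class_id, nodes) in classes.iter().enumerate() {
        let mut found = Vec::new();
        for (node_id, node) in nodes.iter().enumerate() {
            // Start from the bare node choice, then fold in each child's beam.
            let mut partial = vec![Candidate {
                choices: BTreeMap::from([(class_id, node_id)]),
            }];
            for &child in &node.children {
                let mut next = Vec::new();
                for p in &partial {
                    for c in &beams[child] {
                        next.push(merge(p, c));
                    }
                }
                prune(&mut next, classes, k);
                partial = next;
            }
            found.extend(partial);
        }
        prune(&mut found, classes, k);
        beams.push(found);
    }
    beams
}

/// Toy egraph where sharing the expensive class 0 beats the locally cheapest choice.
fn demo_classes() -> Vec<Vec<Node>> {
    let n = |cost, children: &[usize]| Node { cost, children: children.to_vec() };
    vec![
        vec![n(10, &[])],               // class 0: expensive leaf
        vec![n(4, &[])],                // class 1: cheap leaf
        vec![n(1, &[1]), n(1, &[0])],   // class 2: two alternatives
        vec![n(1, &[0])],               // class 3: must use class 0
        vec![n(1, &[2, 3])],            // class 4: root
    ]
}
```

On `demo_classes()`, K=1 commits class 2 to the cheap leaf and pays for both leaves at the root, whereas K=2 keeps the alternative that shares class 0 with class 3 and ends up cheaper overall.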

Performance comes primarily from FastEGraph, a compact efficient static representation of an egraph where:

  • Ids use the smallest integer type (u16, u32, ...) that can represent the problem.
  • The storage is compressed in a way similar to 'Compressed Column Storage' for sparse matrices.
  • Data is stored in a Structure-of-Arrays format.
  • All information is available by direct indexing (as opposed to, e.g., hash tables).
  • Node child-eclass-sets, parent edges, class min-cost, etc. are precomputed.
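As an illustration of the layout described above, the following is a small sketch of an offset-array (compressed sparse) egraph in Structure-of-Arrays form. It is not the actual `FastEGraph` from this PR: the type `CompactEGraph`, its fields, and its methods are hypothetical, and the id widths (`u16` for class ids, `u32` for offsets) are fixed here as an assumption rather than chosen per problem.

```rust
use std::ops::Range;

/// Hypothetical compact egraph; every lookup is a plain slice index.
struct CompactEGraph {
    /// Nodes of class `c` are the node indices `class_start[c]..class_start[c + 1]`.
    class_start: Vec<u32>,
    /// One cost per node, indexed by node index.
    node_cost: Vec<u32>,
    /// Children of node `n` are `child_class[child_start[n]..child_start[n + 1]]`.
    child_start: Vec<u32>,
    /// Flattened child class ids; u16 assumes fewer than 65,536 classes.
    child_class: Vec<u16>,
}

impl CompactEGraph {
    /// Build from a nested form: classes -> nodes -> (cost, child class ids).
    fn from_nested(classes: &[Vec<(u32, Vec<u16>)>]) -> Self {
        let mut g = CompactEGraph {
            class_start: vec![0],
            node_cost: Vec::new(),
            child_start: vec![0],
            child_class: Vec::new(),
        };
        for nodes in classes {
            for (cost, children) in nodes {
                g.node_cost.push(*cost);
                g.child_class.extend_from_slice(children);
                g.child_start.push(g.child_class.len() as u32);
            }
            g.class_start.push(g.node_cost.len() as u32);
        }
        g
    }

    /// Node indices belonging to `class`.
    fn nodes_of(&self, class: usize) -> Range<usize> {
        self.class_start[class] as usize..self.class_start[class + 1] as usize
    }

    /// Child class ids of `node`, as a contiguous slice.
    fn children_of(&self, node: usize) -> &[u16] {
        let r = self.child_start[node] as usize..self.child_start[node + 1] as usize;
        &self.child_class[r]
    }

    /// Cheapest own-cost among the nodes of `class` (None for an empty class).
    /// A real implementation could precompute this once and store it per class.
    fn min_node_cost(&self, class: usize) -> Option<u32> {
        self.nodes_of(class).map(|n| self.node_cost[n]).min()
    }
}
```

Because each class's nodes and each node's children are contiguous ranges, iterating over them touches sequential memory and needs no hashing.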

TODO:

  • The multi-threading has substantial locking overhead that is detrimental for all but the largest benchmarks. Reintroducing a single-threaded variant would be useful.
  • The cost cutoff has a bug that degrades solution quality substantially. It must somehow be too aggressive, but I haven't found where.
  • Merging a partial solution with a root candidate currently favors one side arbitrarily when node choices conflict. What is the optimal strategy here?
  • As a consequence of the above, the partial solutions depend on the order of root traversal. Randomizing that order helps by introducing more structural variants.
