@harshasiddartha

Summary

This PR fixes issue #724 by clarifying how empty e-classes are serialized. Previously, empty e-classes (whose nodes were deleted, or which never had any nodes) were serialized as [...], the same placeholder used when nodes are omitted due to size constraints.

Changes

  • Empty e-classes now use "" (empty string) instead of "[...]" to distinguish them from omitted nodes
  • Added empty_eclasses field to SerializeOutput to track empty e-classes separately
  • Added warnings for empty e-classes in serialization output (similar to truncated/discarded function warnings)
  • Updated omitted_description() to include empty e-class warnings
  • Updated is_complete() to check for empty e-classes as well
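
As a rough illustration of the changes listed above, the output type might look like the following. This is a minimal sketch, not the code from this PR: `SerializeOutputSketch` is a hypothetical stand-in, the `truncated_functions` field name and the exact warning wording are assumptions, and only the `empty_eclasses` field and the two method names come from the description.

```rust
/// Hypothetical stand-in for the serialization result described in this PR.
/// Only `empty_eclasses`, `is_complete`, and `omitted_description` are named
/// in the description; everything else here is an assumption.
#[derive(Debug, Default)]
pub struct SerializeOutputSketch {
    /// Functions that were cut short by a size limit (assumed field name).
    pub truncated_functions: Vec<String>,
    /// Functions that were dropped entirely by a size limit.
    pub discarded_functions: Vec<String>,
    /// E-classes that have no nodes at all.
    pub empty_eclasses: Vec<String>,
}

impl SerializeOutputSketch {
    /// Complete only if nothing was truncated, discarded, or found empty.
    pub fn is_complete(&self) -> bool {
        self.truncated_functions.is_empty()
            && self.discarded_functions.is_empty()
            && self.empty_eclasses.is_empty()
    }

    /// One line per category of missing content (wording is illustrative).
    pub fn omitted_description(&self) -> String {
        let mut parts = Vec::new();
        if !self.discarded_functions.is_empty() {
            parts.push(format!("Omitted: {}", self.discarded_functions.join(", ")));
        }
        if !self.truncated_functions.is_empty() {
            parts.push(format!("Truncated: {}", self.truncated_functions.join(", ")));
        }
        if !self.empty_eclasses.is_empty() {
            parts.push(format!("Empty e-classes: {}", self.empty_eclasses.join(", ")));
        }
        parts.join("\n")
    }
}
```

With this shape, a result that contains any empty e-class reports `is_complete() == false` and mentions the e-class in its description.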

Implementation Details

The fix distinguishes between:

  1. Empty e-classes: E-classes that contain no nodes (deleted or never populated) → Use ""
  2. Omitted e-classes: E-classes that had nodes but were truncated/discarded due to size constraints → Use "[...]" (unchanged)

This is achieved by tracking which e-classes appear as outputs of any function (even truncated/discarded ones) during serialization, allowing us to distinguish between truly empty e-classes and ones that were omitted.
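
The tracking idea can be sketched in isolation as follows. This is an illustrative sketch under stated assumptions, not the PR's implementation: `Placeholder`, `classify`, and `outputs_seen` are hypothetical names, and e-class ids are modelled as plain strings.

```rust
use std::collections::HashSet;

/// Hypothetical classification of an e-class that has no serialized nodes.
#[derive(Debug, PartialEq, Eq)]
pub enum Placeholder {
    /// No function row ever produced this e-class.
    Empty,
    /// Some row produced it, but the row was dropped by a size limit.
    Omitted,
}

impl Placeholder {
    /// The `op` strings described in this PR: "" for empty, "[...]" for omitted.
    pub fn op(&self) -> &'static str {
        match self {
            Placeholder::Empty => "",
            Placeholder::Omitted => "[...]",
        }
    }
}

/// Classify an e-class that ended up without serialized nodes.
/// `outputs_seen` is assumed to hold every e-class id that appeared as the
/// output of some function row, including rows of truncated or discarded
/// functions.
pub fn classify(class_id: &str, outputs_seen: &HashSet<String>) -> Placeholder {
    if outputs_seen.contains(class_id) {
        Placeholder::Omitted
    } else {
        Placeholder::Empty
    }
}
```

The cost of this approach is that `outputs_seen` has to be filled even for functions that are being discarded, which is what the extra traversal in the diff does.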

Testing

  • Added test test_serialize_empty_eclass() to verify the functionality
  • All existing tests pass
  • Warnings are displayed in CLI output when serializing e-graphs with empty e-classes

Related Issue

Fixes #724

@harshasiddartha harshasiddartha requested a review from a team as a code owner October 31, 2025 13:04
@harshasiddartha harshasiddartha requested review from ezrosent and removed request for a team October 31, 2025 13:04
@codspeed-hq

codspeed-hq bot commented Oct 31, 2025

CodSpeed Performance Report

Merging #726 will not alter performance

Comparing harshasiddartha:fix/empty-eclass-serialization-clarification (c2e1280) with main (ef90b97)

Summary

✅ 20 untouched
⏩ 190 skipped [1]

Footnotes

  1. 190 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

@harshasiddartha
Author

@ezrosent I have fixed the issue. Could you review my PR and merge it?

Member

@saulshanabrook saulshanabrook left a comment

Thank you for working on this! Left a few comments.

assert_eq!(serialize_output.empty_eclasses.len(), serialize_output.empty_eclasses.len());

// If there are empty e-classes, verify they use "" instead of "[...]"
if !serialize_output.empty_eclasses.is_empty() {
Member

Can you verify here that the result is not complete and that the description includes the empty e-classes? This condition should always be true in this test, right?

if functions_kept >= max_functions {
discarded_functions.push(name.clone());
// Track outputs of discarded functions
self.backend.for_each_while(function.backend_id, |row| {
Member

I think it's ok to not count as empty e-classes that were omitted due to max limits as well.

Basically like once we reach max functions or nodes, we can treat any pointers to those e-classes as omitted not empty.

The key thing is making sure that when we haven't hit the max node limit, empty e-classes are properly reported as empty.

So I think a smaller fix here would be to leave all this code alone, and in the later part, where we decide whether to emit an omitted or an empty e-class node, just emit it as empty if we haven't skipped any functions/nodes and as omitted if we have.

This is an under-approximation for empty, but it should work for the use case in the issue, and it keeps the e-graph from having to continue being traversed once we reach our max nodes.
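
A minimal sketch of that simpler rule, assuming the serializer already knows how many functions it truncated or discarded. The names `MissingKind` and `classify_missing` are hypothetical and not part of the codebase.

```rust
/// Hypothetical label for an e-class that has no serialized nodes.
#[derive(Debug, PartialEq, Eq)]
pub enum MissingKind {
    Empty,
    Omitted,
}

/// Under-approximation of "empty": an e-class without nodes is reported as
/// empty only when no size limit was hit anywhere. As soon as anything was
/// truncated or discarded, every missing e-class is reported as omitted,
/// so no extra traversal of the skipped functions is needed.
pub fn classify_missing(truncated: usize, discarded: usize) -> MissingKind {
    if truncated == 0 && discarded == 0 {
        MissingKind::Empty
    } else {
        MissingKind::Omitted
    }
}
```

The trade-off is that a genuinely empty e-class is labelled omitted whenever a limit was reached, which is acceptable for the use case in the issue.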

if is_empty {
// Empty e-class: use empty string as the name
serializer.empty_eclasses.push(class_id.to_string());
let node_id = self.to_node_id(Some(sort), SerializedNode::Dummy(value));
Member

I think we could split Dummy into two versions. Omitted and EmptyEClass. So that we know which one it's pointing at.
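
The suggested split could look roughly like this. It is only a sketch: `SerializedNodeSketch` is a hypothetical reduced enum showing just the two placeholder variants, and `Value` is a stand-in alias rather than the real value type.

```rust
/// Stand-in for the backend value type (assumption for this sketch).
pub type Value = u32;

/// Hypothetical reduced enum: the single placeholder variant is replaced by
/// two variants so the kind of placeholder is known from the node itself.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SerializedNodeSketch {
    /// The e-class had nodes, but they were left out due to size limits.
    Omitted(Value),
    /// The e-class has no nodes at all.
    EmptyEClass(Value),
}

impl SerializedNodeSketch {
    /// True only for the placeholder that stands for an empty e-class.
    pub fn is_empty_eclass(&self) -> bool {
        matches!(self, SerializedNodeSketch::EmptyEClass(_))
    }
}
```

With separate variants, code that parses a node id back can tell the two cases apart without inspecting the `op` string.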

);
VecDeque::from(vec![node_id])
} else {
// Omitted due to size constraints: use "[...]" as before
Member

Nit: could you collapse this conditional since most of it is the same and just add a smaller one for the op and type?
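
One way the collapsed version could be shaped, shown on simplified stand-in types. `NodeSketch` and `insert_placeholder` are hypothetical, the map of nodes is a plain `HashMap`, and only the `op` difference is modelled here.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a serialized node (assumption for this sketch).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NodeSketch {
    pub op: String,
    pub eclass: String,
}

/// Insert a placeholder node for an e-class that has no serialized nodes.
/// The two cases share the insertion code; only the `op` is chosen by a
/// small conditional.
pub fn insert_placeholder(
    nodes: &mut HashMap<String, NodeSketch>,
    node_id: &str,
    class_id: &str,
    is_empty: bool,
) {
    // "" marks an empty e-class, "[...]" marks nodes omitted by size limits.
    let op = if is_empty { "" } else { "[...]" };
    nodes.insert(
        node_id.to_string(),
        NodeSketch {
            op: op.to_string(),
            eclass: class_id.to_string(),
        },
    );
}
```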

serializer.result.nodes.insert(
node_id.clone(),
egraph_serialize::Node {
op: "".to_string(),
Member

I think it might be helpful to be more explicit here. Like EMPTY-ECLASS or something.

if node.op == "" {
let class_id = &node.eclass;
assert!(serialize_output.empty_eclasses.iter().any(|e| e == &class_id.to_string()));
}
Member

Could you also verify that the serialized node type, when parsed back from the serialized e-graph, is set to EmptyEClass? (Previously Omitted.)

Development

Successfully merging this pull request may close these issues.

Clarify empty serializing empty e-classes

2 participants