-
Notifications
You must be signed in to change notification settings - Fork 83
Fix #724: Clarify empty e-class serialization #726
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix #724: Clarify empty e-class serialization #726
Conversation
- Empty e-classes now use "" instead of "[...]" to distinguish from omitted nodes - Add empty_eclasses field to SerializeOutput to track empty e-classes - Add warnings for empty e-classes in serialization output - Update omitted_description() to include empty e-class warnings - Update is_complete() to check for empty e-classes
CodSpeed Performance ReportMerging #726 will not alter performanceComparing Summary
Footnotes
|
|
@ezrosent i have fixed the issue can you review my pr and merge it |
saulshanabrook
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this! Left a few comments.
| assert_eq!(serialize_output.empty_eclasses.len(), serialize_output.empty_eclasses.len()); | ||
|
|
||
| // If there are empty e-classes, verify they use "" instead of "[...]" | ||
| if !serialize_output.empty_eclasses.is_empty() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you verify here that the result is not complete and the description includes empty? Like this statement should always true in this test, right?
| if functions_kept >= max_functions { | ||
| discarded_functions.push(name.clone()); | ||
| // Track outputs of discarded functions | ||
| self.backend.for_each_while(function.backend_id, |row| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's ok to not count as empty e-classes that were omitted due to max limits as well.
Basically like once we reach max functions or nodes, we can treat any pointers to those e-classes as omitted not empty.
The key thing is making sure that if we aren't having max nodes then empty e-classes are properly omitted.
So I think a smaller fix here would be to leave all this code alone, and in the later part when we check if we should omit an omitted or an empty e-class node, just omit it as empty if we haven't skipped any functions/nodes and omitted if we have.
This is an under approximarion for empty, but it should work for the use case in the issue and keeps the egraph from having to continue being traversed when we reach our max nodes.
| if is_empty { | ||
| // Empty e-class: use empty string as the name | ||
| serializer.empty_eclasses.push(class_id.to_string()); | ||
| let node_id = self.to_node_id(Some(sort), SerializedNode::Dummy(value)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could split Dummy into two versions. Omitted and EmptyEClass. So that we know which one it's pointing at.
| ); | ||
| VecDeque::from(vec![node_id]) | ||
| } else { | ||
| // Omitted due to size constraints: use "[...]" as before |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: could you collapse this conditional since most of it is the same and just add a smaller one for the op and type?
| serializer.result.nodes.insert( | ||
| node_id.clone(), | ||
| egraph_serialize::Node { | ||
| op: "".to_string(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be helpful to be more explicit here. Like EMPTY-ECLASS or something.
| if node.op == "" { | ||
| let class_id = &node.eclass; | ||
| assert!(serialize_output.empty_eclasses.iter().any(|e| e == &class_id.to_string())); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also verify that serialized node type when parsed from the serialized egraph is set to EmptyEClass? (Prev Omitted)
Summary
This PR fixes issue #724 by clarifying how empty e-classes are serialized. Previously, empty e-classes (deleted or never had nodes) were serialized as
[...], which was the same as when nodes were omitted due to size constraints.Changes
""(empty string) instead of"[...]"to distinguish them from omitted nodesempty_eclassesfield toSerializeOutputto track empty e-classes separatelyomitted_description()to include empty e-class warningsis_complete()to check for empty e-classes as wellImplementation Details
The fix distinguishes between:
"""[...]"(unchanged)This is achieved by tracking which e-classes appear as outputs of any function (even truncated/discarded ones) during serialization, allowing us to distinguish between truly empty e-classes and ones that were omitted.
Testing
test_serialize_empty_eclass()to verify the functionalityRelated Issue
Fixes #724