Member

@saulshanabrook saulshanabrook commented Oct 26, 2025

Adds experimental support for an AnyExpr object, which acts like an existing Python object and forwards all calls to the underlying object.

Useful for tracing.
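
To illustrate the forwarding idea in plain Python, here is a minimal sketch. It is not egglog's AnyExpr implementation: `TracingProxy`, its `log` argument, and the `name` label are hypothetical names chosen for this example, and keyword arguments are forwarded but left out of the log for brevity.

```python
from typing import Any


class TracingProxy:
    """Hypothetical stand-in that forwards calls to a wrapped object and records them."""

    def __init__(self, target: Any, log: list, name: str = "x") -> None:
        self._target = target
        self._log = log
        self._name = name

    def __getattr__(self, attr: str) -> Any:
        # Only invoked when normal lookup fails, so _target, _log and _name
        # resolve without reaching this method.
        value = getattr(self._target, attr)
        if not callable(value):
            return value

        def call(*args: Any, **kwargs: Any) -> Any:
            shown = ", ".join(map(repr, args))
            self._log.append(f"{self._name}.{attr}({shown})")
            return value(*args, **kwargs)

        return call

    def __add__(self, other: Any) -> Any:
        # Operators are looked up on the type, not the instance, so
        # __getattr__ never sees them and each one needs explicit forwarding.
        self._log.append(f"{self._name} + {other!r}")
        return self._target + other


log = []
proxy = TracingProxy([3, 1, 2], log)
proxy.append(0)          # forwarded to list.append
proxy.sort()             # forwarded to list.sort
combined = proxy + [9]   # forwarded to list.__add__
```

After these calls, `log` holds one entry per forwarded operation, which is the kind of record a tracer can later replay or inspect.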

TODO:

  • Add support for knowing which expressions were constructed before eval calls, so that if we get partway through a trace we can understand which parts of the computation depend on the evals
  • Add tutorial for using AnyExpr

This PR also makes a number of other changes to get there, including changing PyObjects to be stored as pickled bytes instead of live references. This prevents issues where objects like (1, 0) would hash the same, and so be stored the same, as (1.0, 0). By pickling the object and storing the bytes, we remove any issues based on hashing. It also makes it easier to support unhashable objects while still having consistent IDs for them.
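
The hashing problem can be reproduced with the standard library alone. The sketch below uses `pickle` rather than cloudpickle, and plain dicts rather than egglog's storage, so it only illustrates the idea; the names `by_object` and `by_bytes` are made up for this example.

```python
import pickle

int_tuple = (1, 0)
float_tuple = (1.0, 0)

# Python treats these as equal and gives them the same hash,
# because 1 == 1.0 and hash(1) == hash(1.0).
assert int_tuple == float_tuple
assert hash(int_tuple) == hash(float_tuple)

# Keyed by the live objects, the second entry overwrites the first.
by_object = {int_tuple: "int tuple"}
by_object[float_tuple] = "float tuple"

# Keyed by pickled bytes, the two values stay distinct,
# since an int and a float serialize differently.
by_bytes = {
    pickle.dumps(int_tuple): "int tuple",
    pickle.dumps(float_tuple): "float tuple",
}

# Unhashable objects such as lists cannot be dict keys,
# but their pickled bytes can.
list_key = pickle.dumps([1, 2, 3])
by_bytes[list_key] = "a list"
```

Note that pickled bytes are not a canonical form in general: two equal objects can serialize differently (for example, sets built in a different order), so this is a sketch of the motivation rather than a guarantee.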


codspeed-hq bot commented Oct 26, 2025

CodSpeed Performance Report

Merging #366 will degrade performances by 32.01%

Comparing any-object (f8e8733) with main (40b1326)

Summary

❌ 6 regressions
✅ 8 untouched

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Mode | Benchmark | BASE | HEAD | Change |
|---|---|---|---|---|
| Simulation | test_jit[add] | 351.1 ms | 516.4 ms | -32.01% |
| WallTime | test_jit[add] | 432 ms | 616.8 ms | -29.97% |
| Simulation | test_jit[lda] | 8.6 s | 11.2 s | -23.91% |
| WallTime | test_jit[lda] | 11.9 s | 14.9 s | -20.17% |
| Simulation | test_jit[tuple] | 607.7 ms | 797 ms | -23.75% |
| WallTime | test_jit[tuple] | 780.6 ms | 1,008.7 ms | -22.62% |
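
The Change column looks consistent with BASE / HEAD - 1, rather than the more familiar (HEAD - BASE) / BASE. That formula is inferred from the numbers in this report, not taken from CodSpeed's documentation, and `relative_change` is a name invented for this sketch. Rows whose times are shown rounded to one decimal of a second (the lda rows) only match approximately.

```python
def relative_change(base: float, head: float) -> float:
    """Percentage change computed as BASE / HEAD - 1 (assumed formula)."""
    return (base / head - 1.0) * 100.0


# (BASE, HEAD) in milliseconds, copied from the Simulation rows above.
rows = {
    "test_jit[add]": (351.1, 516.4),
    "test_jit[tuple]": (607.7, 797.0),
}

changes = {name: round(relative_change(base, head), 2) for name, (base, head) in rows.items()}

for name, change in changes.items():
    print(f"{name}: {change:.2f}%")
```

Under this reading, a 32.01% degradation means the benchmark runs at about 68% of its previous speed, which corresponds to roughly 47% more time per run.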

@saulshanabrook
Member Author

I believe these regressions are due to slowdowns in the upstream egglog.

@saulshanabrook saulshanabrook marked this pull request as ready for review October 29, 2025 23:29
@saulshanabrook
Member Author

@codex can you review this PR

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Contributor

Copilot AI left a comment

Pull Request Overview

This PR refactors how Python objects are stored in the egglog system, switching from storing live references to serializing values with cloudpickle. Key changes include:

  • BREAKING: Drops Python 3.10 support (now requires 3.11+)
  • BREAKING: PyObject values are now stored as cloudpickle bytes instead of live references, allowing duplicates to merge by value
  • Removes the PyObjectSort class in favor of directly handling pickled values
  • Adds __call__ and call_extended methods to PyObject, deprecating py_eval_fn
  • Introduces Ident dataclass to track module information for identifiers
  • Adds __replace_expr__ method to support in-place expression mutations
  • Updates bundled Egglog dependencies and improves error handling

Reviewed Changes

Copilot reviewed 37 out of 42 changed files in this pull request and generated 125 comments.

| File | Description |
|---|---|
| src/py_object_sort.rs | Refactored to use pickled bytes instead of storing live Python references |
| python/egglog/declarations.py | Added Ident class and updated all identifier handling to track modules |
| python/egglog/runtime.py | Added `__replace_expr__`, improved method descriptors, and preserved method handling |
| python/egglog/builtins.py | Added `__call__` method to PyObject and deprecated py_eval_fn |
| stubtest_allow | Updated to allow `EGraph.__init__` instead of removed `PyObjectSort.__init__` |
| pyproject.toml | Bumped minimum Python to 3.11 and added cloudpickle dependency |

eventTime: Y.eventTime,
lane: Y.lane,
tag: Y.tag,
payload: Y.payload,
callback: Y.callback,
next: null
};
X === null ? L = X = N : (X.next = N, X = N), Y = Y.next;
X === null ? L = X = w : (X.next = w, X = w), Y = Y.next;
} while (Y !== null);
X === null ? L = X = b : (X.next = b, X = b);
Copilot AI Nov 3, 2025

Variable 'X' is of type object, but it is compared to an expression of type null.

var N = (L.mode & pZ) !== Ql;
return yJ(c, b, d, K, o, N, X);
var w = (L.mode & pZ) !== Ql;
return yJ(c, b, d, K, o, w, X);
}
function iH(c, b, d, o) {
return Me(d, c), d.mode & pZ, SJ(c, b);
Copilot AI Nov 3, 2025

This expression has no effect.

El !== null && BZ(ol, El.startTime - Il);
}
}
function Wl(Il, El) {
T = !1, f && (f = !1, Rb()), v = !0;
v = !1, f && (f = !1, Rb()), g = !0;
var sZ = F;
try {
var YZ;
Copilot AI Nov 3, 2025

Unused variable YZ.

@saulshanabrook
Member Author

@codex can you open a PR to modify the AGENTS.md file to account for some of the issues Copilot incorrectly flagged here? The warnings in test_high_level.py about functions with no body being no-ops are incorrect, because egglog works with uninterpreted functions. It is also giving lots of warnings in visualizer.js, which is a generated dependency and so shouldn't be edited.

@chatgpt-codex-connector

Summary

  • Added repository instructions noting that the uninterpreted-function helpers in python/tests/test_high_level.py intentionally lack bodies and that python/egglog/visualizer.js is generated and should not be modified manually. (AGENTS.md, L1-L4)

Testing

  • ⚠️ Not run (not required for documentation-only change).

View task →

@saulshanabrook saulshanabrook merged commit adf9ea3 into main Nov 3, 2025
20 of 21 checks passed
@saulshanabrook saulshanabrook deleted the any-object branch November 3, 2025 19:38
2 participants