Skip to content

Conversation

@oflatt
Copy link
Member

@oflatt oflatt commented Apr 10, 2025

No description provided.

@codspeed-hq
Copy link

codspeed-hq bot commented Apr 10, 2025

CodSpeed Performance Report

Merging #538 will not alter performance

Comparing oflatt-fn-encoding (fffc4e9) with oflatt-fn-encoding (0c37937)

Summary

✅ 10 untouched benchmarks


# Known limitations

- Primitive outputs have to be wrapped in a constructor, which can be inconvenient
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think primitive inputs need to be wrapped as well (e.g., how to represent NumOp in your example below)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, that's true
(unless we generate specialized PartialApp operators for each type instead of type erasure)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am anti type erasure, but it think it is needed to support nested PartialApps as in the last example.

@Alex-Fischman
Copy link
Collaborator

I see two different end goals for higher-order functions in egglog:

  1. We do the trick (I think it's called the demand transformation?) to compile a functional program into Datalog.
  2. We extend the core to handle functions/primitives as first-class values.

This proposal moves us towards Option 1. However, I also think that Option 2 is worth considering, and hasn't been discussed much. In particular, I think @FTRobbin has some thoughts on how UnstableFn is less expressive than Option 2, and I want to learn more about the differences between full first-class functions and UnstableFn before making a value judgement between the two options.

One upside of Option 1 is that it can be done entirely in egglog-experimental, while Option 2 probably requires changes to core egglog.

(There is also of course the secret Option 3: we declare that higher-order functions are out-of-scope, and force users to do Option 1 themselves if they really think they want it.)

@Alex-Fischman
Copy link
Collaborator

I didn't realize that others weren't thinking about this difference; when I was saying that it was impossible to implement UnstableFn purely in experimental in the meeting, I should have clarified that I was only talking about Option 2. Sorry for snapping at you Oliver.

- Primitives can't have type information
- Support for the REPL (not a whole-program transformation)

# Known limitations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this approach makes resugaring for proofs harder? Or is it about as hard as it was before?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also some of the same issues for visualization and extraction - we would have to de-encode to deal with them in some way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm starting to think that proofs / visualizations are no harder or easier whatever we decide

@Alex-Fischman
Copy link
Collaborator

More detail on Option 2: supporting higher-order functions similar to C's function pointers requires the following things:

  • Syntax to refer to the name of a table/primitive
  • That syntax gets resolved to a specific known instance, with type information, during typechecking
  • The resolved function gets stored with its types in the new backend as a Value
  • We expose a way to call that Value from inside an ExternalFunction

I think it makes sense to distinguish higher-order functions from closures here because only closures need to be rebuilt.
Supporting closures becomes much easier once we have higher-order functions though; I think it's possible to do in experimental with existing interfaces if we have "function pointers" in core.

@oflatt
Copy link
Member Author

oflatt commented Apr 11, 2025

@Alex-Fischman
Oh I see, I didn't realize you were thinking of adding function pointers but not closures. Sorry for the misunderstanding. I'd also be interested in the limitations of unstable functions and option 1 compared to option 2.

@Alex-Fischman
Copy link
Collaborator

It's not that I don't want to add closures, but that I'm trying to figure out which piece is impossible to do in experimental.

@FTRobbin
Copy link
Collaborator

FTRobbin commented Apr 14, 2025

I see that we are discussing two interrelated problems here. So let me first separate them for clarity:

(1) What variant of the unstable-fn feature will we provide in the June Vision?
(2) How do we implement that? What changes do we need to make to the core?

For (1), I claim we will not be able to come up with the "right" version due to our tight time budget. Two ways to approach the "right" answer would be (a) iterations with users as Saul suggested, which seems unlikely to happen before June; or (b) do a PL theory dive to find similar design questions that have been answered - I did some quick stabs at this but did not find an obvious answer that would serve our case well. Going down this path would also take much more effort than we have.

However, we still have to decide on (1), given we will most likely get the design "wrong". This is where Zach's point comes in: whatever "wrong" thing we end up with, we should let that thing reside in the experimental while keeping the core clean. In this way, we do not prevent a future "less wrong" thing from being done.

In this view, the answer to (1) is anything that we can implement while keeping the core clean.

But this pushes more complexity into (2). We now need to think about not only how to accommodate the one version we pick in the core/experimental split but also potential features. "Hackability" as coined by Zach: after June, can someone pull egglog and hack a feature like this in the experimental? For comparison, egg (at the beginning) is great at this.

It seems to me that moving in this direction would call for a set of "unsafe" APIs being exposed by the core to the experimental for expressive power. I imagine maybe that can only be unlocked by a combination of custom primitives + custom plugins.

@saulshanabrook
Copy link
Member

saulshanabrook commented Jul 16, 2025

I was also thinking this technique could work without having to use the Any type and wrapping constructors... If instead, you generate a new constructor when you need it for each set of types of functions.

For the example in the PR:

(datatype Math
  (Num i64)
  (Var String)
  (Add Math Math)
  (Mul Math Math))
(rewrite (Mul (Num x) (Num y)) (Num (* x y)))
(datatype MathList
  (Nil)
  (Cons Math MathList))
(sort MathFn (UnstableFn (Math) Math))
(constructor square (Math) Math)
(rewrite (square x) (Mul x x))
(let square-fn (unstable-fn "square" ))
;; test that we can call a function
(let squared-3 (unstable-app square-fn (Num 3)))
(check (= squared-3 (square (Num 3))))

It could produce this program:

(datatype Math
  (Num i64)
  (Var String)
  (Add Math Math)
  (Mul Math Math))
(rewrite (Mul (Num x) (Num y)) (Num (* x y)))
(datatype MathList
  (Nil)
  (Cons Math MathList))
(sort MathFn)
(constructor ApplyMathFn (MathFn Math) Math)

(constructor square (Math) Math)
(rewrite (square x) (Mul x x))

;; generated
(constructor SquareMathFn () MathFn)
(rewrite (ApplyMathFn (SquareMathFn) x) (square x))

(let square-fn (SquareMathFn))
;; test that we can call a function
(let squared-3 (ApplyMathFn square-fn (Num 3)))
(check (= squared-3 (square (Num 3))))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants