
Conversation

@rolfmorel
Contributor

@rolfmorel rolfmorel commented Nov 11, 2025

Simple demonstration of applying a schedule to a payload.

Relies on the little wrapper method NamedSequenceOp.apply, which now exists upstream, to make things as simple as can be.

Simple wrapper around upstream's transform_interpreter API. Invokable
directly from Python and via the commandline on separate payload and
schedule files.

def example_schedule() -> ir.Module:
    schedule = ir.Module.create()
    schedule.operation.attributes["transform.with_named_sequence"] = ir.UnitAttr.get()
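For reference, this snippet builds towards textual IR along these lines; a minimal sketch, assuming the conventional @__transform_main entry point and a trivial sequence body (both illustrative, not taken from the PR):

```mlir
// A schedule module; the unit attribute marks it as holding named sequences.
module attributes {transform.with_named_sequence} {
  // Conventional entry point for the transform interpreter; this body
  // yields immediately, i.e. it leaves the payload unchanged.
  transform.named_sequence @__transform_main(
      %root: !transform.any_op {transform.readonly}) {
    transform.yield
  }
}
```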
Member

It would be better if the transform module provided this as a helper, so that users only need to add the transforms themselves.

Contributor Author

@rolfmorel rolfmorel Nov 11, 2025

Indeed, we should probably make dealing with such details a bit easier. The thing I would not like is to obscure that we are just constructing IR; i.e., to me this code still reflects what the .mlir will look like. We should find a middle ground somehow. So the helpers should probably look like IR builders as well?

Contributor Author

On returning to the PR: I think this is only non-ergonomic in that one must intersperse operation between schedule and attributes (an upstream issue). Beyond that, this API usage properly reflects how IR is built, in as terse a manner as upstream supports.

Separately, we could of course have helpers to, e.g., wrap named_sequences up in appropriate modules. As that's not entirely trivial (e.g. multiple sequences will live in the same module, so we cannot simply wrap a single named_sequence), I think we can punt on this until we start observing that kind of replication across the repo.
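To illustrate the non-triviality: several named sequences can share one schedule module, so a helper cannot simply wrap each sequence in its own module. A sketch with hypothetical sequence names:

```mlir
module attributes {transform.with_named_sequence} {
  // Two independent sequences living side by side in the same module;
  // trivial bodies, for illustration only.
  transform.named_sequence @tile_matmuls(%root: !transform.any_op {transform.readonly}) {
    transform.yield
  }
  transform.named_sequence @fuse_elementwise(%root: !transform.any_op {transform.readonly}) {
    transform.yield
  }
}
```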

Member

I'd like to dissociate the transform helpers a bit from the actual IR, because while the underlying IR shape may change, the API should not. If all our APIs follow the IR shape closely, then we'll have to change their usage every time the IR changes, and that doesn't scale.

Contributor Author

Sure, we can improve this. I don't think the right solution is obvious though (go ahead and play around with it -- I am sure @makslevental agrees), and I don't think this is the PR to solve it. Let's just merge this and address the right kind of wrappers in a dedicated PR (such a more focused PR is also likely to engage the ex-Brium folks more, I imagine).

I agree with Renato's feedback here.

At a higher level, I am already struggling to parse the code-structure within Lighthouse. For example:

What's the intended difference between the two? I feel similarly about https://github.com/llvm/lighthouse/tree/main/python/examples and how that dir is evolving.

IMO, we should think carefully about the structure and these initial PRs are key. While I agree that it's hard to design future-proof APIs, there's scope to improve modularity here. And to identify clear boundaries between "components" within this PR. Specifically (*):

  1. Payload IR generation ("Ingress").
  2. Schedule IR generation ("Auto-tuning").
  3. Driver, i.e. generate payload -> apply schedule ("Execution + driver").

There's yet another step - testing - that is also included here. These are all fairly fundamental design elements.

I am not advocating for us identifying the perfect solution within this PR and I am happy for us to iterate in-tree. That said, I will leave some inline comments and do suggest making the newly added example more modular.

(*) Using labels from https://github.com/llvm/lighthouse/wiki/Integrator#design-points in "".

Member

FYI, the discussion on general infrastructure is here:
#16

Contributor Author

The structuring of the repo is a distinct point from what this chain was about before: more convenience helpers.

I agree that mlir-gen (as a python module name: mlir_gen) should move inside python/lighthouse/ingress or else into a tools dir (and, as I've said before, IMO that leading python dir is redundant). I will do this clean-up in another PR👍

For the examples dir, I currently only see a single quirk: python/examples/mlir/compile_and_run.py. Otherwise the hierarchy mirrors that of the modules in python/lighthouse. Did you have something else in mind?

As for testing, I will now enable CI with this PR.

Contributor

@adam-smnk adam-smnk Nov 19, 2025

> and, as I've said before, IMO that leading python dir is redundant

+1
It also just hides examples, which are better to have at the top level. As a newcomer to a repo, I want to find those first.

> For the examples dir, I currently only see a single quirk: python/examples/mlir/compile_and_run.py. Otherwise the hierarchy mirrors that of the modules in python/lighthouse.

I don't think examples need to match the overall module structure. They largely do now because it makes sense thematically, i.e., how to import a torch model into MLIR uses ingress modules (not saying the current structure is perfect or immutable). But examples could be more abstract than the individual modules lighthouse provides and/or span multiple modules.

Closer structural mirroring would make sense for a test dir which we could use soon.

@rengolin rengolin requested a review from Groverkss November 11, 2025 13:24
Contributor

@adam-smnk adam-smnk left a comment

Really neat and easy to use 👍

Now requires eb9d56c (or later) of llvm-project
@banach-space banach-space left a comment

This is very exciting - thanks for driving this forward!

Before we land it, I think we should get CI set up and confirm that the current changes actually run end-to-end. Right now, I haven’t been able to run anything from Lighthouse. This might just be an issue with my local setup, but without CI we can’t easily tell whether it’s a configuration problem or a deeper design/assumption issue.

@banach-space banach-space left a comment

Thanks for the updates!

I am leaving some more comments inline. Also, now that we have CI, could you add this example there? Thanks!

from mlir.dialects.transform import structured


def example_payload() -> Module:
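For context, the kind of payload IR a function like this might emit; a sketch with made-up names and shapes (not taken from the PR):

```mlir
// A payload function containing a single linalg.matmul on tensors.
func.func @payload(%A: tensor<4x8xf32>, %B: tensor<8x4xf32>,
                   %C: tensor<4x4xf32>) -> tensor<4x4xf32> {
  %0 = linalg.matmul ins(%A, %B : tensor<4x8xf32>, tensor<8x4xf32>)
                     outs(%C : tensor<4x4xf32>) -> tensor<4x4xf32>
  return %0 : tensor<4x4xf32>
}
```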

Let's avoid generic names like @example_payload and use something descriptive instead. For example, what name would we use for the next example? @example_payload_1? That doesn't scale 😅

How about generate_payload_two_matmuls_and_add? We could skip generate_payload if the filename was ... generate_payload.py or something similar ;-) Yes, I do think that having separate files would help.

Contributor

@adam-smnk adam-smnk Nov 19, 2025

Just to preface, I see your point about setting a good example (pun intended) around naming.
Not sure if it's needed in this particular case, at least from the perspective of how I approach it here.
If we go with a more granular approach of multiple files with small examples working together (like you propose in another comment), then it might indeed need a different design approach.

I'd argue that specificity adds more information and implies that something about the exact form/shape/implementation is important in the presented item. This addition can add to or distract from the core message.

I see this file as a self-contained example that focuses primarily on the mechanism behind taking two MLIR modules, payload IR and a schedule, and executing them.
As such, I doubt there's a need for scaling. Each standalone example script could have @example_payload as long as that specific payload doesn't matter for the overall idea we're communicating.
This particular IR could be an empty function and little would change (modulo lit checks and perhaps some user confusion due to the "uselessness" of a schedule doing effectively nothing).

Contributor Author

Thank you, @adam-smnk ! That captures my perspective on what is happening here very well!

Comment on lines 107 to 113
cmdline = [
    "python",
    "-m",
    "lighthouse.schedule",
    schedule_file.name,
    payload_file.name,
]

I'd rather we didn't generate python invocation lines from within Python. This means that there is no separation of concerns, and it's quite hard to extract/learn what exactly needs to happen (i.e. what the steps are).

In particular, to me, this script is trying to achieve three things in one go:

  1. Generate Payload IR (most likely generating Linalg or Vector Ops)
  2. Generate Schedule IR (generates TD Ops).
  3. Generate cmdline and invoke it (orthogonal to MLIR generation).

These are 3 separate tasks, each of which comes with its own set of complexities and challenges. Also, ATM, both __main__.py and transform_a_payload_according_to_a_schedule.py are runnable. So, IIUC, there are two ways to run transform_a_payload_according_to_a_schedule.py? If "yes", why?

Contributor Author

> I'd rather we didn't generate python invocation lines from within Python.
It's removed now.

The three tasks are there just because that's the minimal thing we need for a full example. For non-example code, the code for the distinct tasks will be more structured.



@rolfmorel rolfmorel changed the title [transform] Simple Python and cmdline interface for applying schedules [transform] Basic example of applying a schedule to a payload Nov 19, 2025
@rolfmorel
Copy link
Contributor Author

rolfmorel commented Nov 19, 2025

Have simplified it so it is hopefully easier to go in.

The commandline functionality is now gone (can be re-introduced when we need it).

Also runs in CI now (will add CHECK lines back upon enabling CI with lit -- the next PR to be opened).

And to note: NamedSequenceOp.apply was initially a function that this PR was introducing. It was moved upstream in llvm/llvm-project#168223, hence simplifying this PR.

@banach-space banach-space left a comment

Thanks!

@rolfmorel rolfmorel merged commit fa58880 into main Nov 19, 2025
2 checks passed