Wheest
Contributor

@Wheest Wheest commented Oct 13, 2025

It makes sense that Attribute dicts/maps should behave like dicts in the Python bindings. Previously this was not the case.

@Wheest Wheest marked this pull request as ready for review October 13, 2025 14:19
@llvmbot llvmbot added the mlir label Oct 13, 2025
@llvmbot
Member

llvmbot commented Oct 13, 2025

@llvm/pr-subscribers-mlir

Author: Perry Gibson (Wheest)

Changes

It makes sense that Attribute dicts/maps should behave like dicts in the Python bindings. Previously this was not the case.


Full diff: https://github.com/llvm/llvm-project/pull/163200.diff

2 Files Affected:

  • (modified) mlir/lib/Bindings/Python/IRCore.cpp (+46-1)
  • (modified) mlir/test/python/ir/operation.py (+23-6)
diff --git a/mlir/lib/Bindings/Python/IRCore.cpp b/mlir/lib/Bindings/Python/IRCore.cpp
index 7b1710656243a..a94674079bf1c 100644
--- a/mlir/lib/Bindings/Python/IRCore.cpp
+++ b/mlir/lib/Bindings/Python/IRCore.cpp
@@ -2730,6 +2730,16 @@ class PyOpAttributeMap {
         operation->get(), toMlirStringRef(name)));
   }
 
+  template <typename F>
+  auto forEachAttr(F fn) {
+    intptr_t n = mlirOperationGetNumAttributes(operation->get());
+    for (intptr_t i = 0; i < n; ++i) {
+      MlirNamedAttribute na = mlirOperationGetAttribute(operation->get(), i);
+      MlirStringRef name = mlirIdentifierStr(na.name);
+      fn(name, na.attribute);
+    }
+  }
+
   static void bind(nb::module_ &m) {
     nb::class_<PyOpAttributeMap>(m, "OpAttributeMap")
         .def("__contains__", &PyOpAttributeMap::dunderContains)
@@ -2737,7 +2747,42 @@ class PyOpAttributeMap {
         .def("__getitem__", &PyOpAttributeMap::dunderGetItemNamed)
         .def("__getitem__", &PyOpAttributeMap::dunderGetItemIndexed)
         .def("__setitem__", &PyOpAttributeMap::dunderSetItem)
-        .def("__delitem__", &PyOpAttributeMap::dunderDelItem);
+        .def("__delitem__", &PyOpAttributeMap::dunderDelItem)
+        .def("__iter__",
+             [](PyOpAttributeMap &self) {
+               nb::list keys;
+               self.forEachAttr([&](MlirStringRef name, MlirAttribute) {
+                 keys.append(nb::str(name.data, name.length));
+               });
+               return nb::iter(keys);
+             })
+        .def("keys",
+             [](PyOpAttributeMap &self) {
+               nb::list out;
+               self.forEachAttr([&](MlirStringRef name, MlirAttribute) {
+                 out.append(nb::str(name.data, name.length));
+               });
+               return out;
+             })
+        .def("values",
+             [](PyOpAttributeMap &self) {
+               nb::list out;
+               self.forEachAttr([&](MlirStringRef, MlirAttribute attr) {
+                 out.append(PyAttribute(self.operation->getContext(), attr)
+                                .maybeDownCast());
+               });
+               return out;
+             })
+        .def("items", [](PyOpAttributeMap &self) {
+          nb::list out;
+          self.forEachAttr([&](MlirStringRef name, MlirAttribute attr) {
+            out.append(
+                nb::make_tuple(nb::str(name.data, name.length),
+                               PyAttribute(self.operation->getContext(), attr)
+                                   .maybeDownCast()));
+          });
+          return out;
+        });
   }
 
 private:
diff --git a/mlir/test/python/ir/operation.py b/mlir/test/python/ir/operation.py
index cb4cfc8c8a6ec..9d49cb1d25f9e 100644
--- a/mlir/test/python/ir/operation.py
+++ b/mlir/test/python/ir/operation.py
@@ -569,14 +569,31 @@ def testOperationAttributes():
     # CHECK: Attribute value b'text'
     print(f"Attribute value {sattr.value_bytes}")
 
+    # Python dict-style iteration
     # We don't know in which order the attributes are stored.
-    # CHECK-DAG: NamedAttribute(dependent="text")
-    # CHECK-DAG: NamedAttribute(other.attribute=3.000000e+00 : f64)
-    # CHECK-DAG: NamedAttribute(some.attribute=1 : i8)
-    for attr in op.attributes:
-        print(str(attr))
+    # CHECK-DAG: dependent
+    # CHECK-DAG: other.attribute
+    # CHECK-DAG: some.attribute
+    for name in op.attributes:
+        print(name)
+
+    # Basic dict-like introspection
+    # CHECK: True
+    print("some.attribute" in op.attributes)
+    # CHECK: False
+    print("missing" in op.attributes)
+    # CHECK: Keys: ['dependent', 'other.attribute', 'some.attribute']
+    print("Keys:", sorted(op.attributes.keys()))
+    # CHECK: Values count 3
+    print("Values count", len(op.attributes.values()))
+    # CHECK: Items count 3
+    print("Items count", len(op.attributes.items()))
+
+    # Dict() conversion test
+    d = {k: v.value for k, v in dict(op.attributes).items()}
+    # CHECK: Dict mapping {'dependent': 'text', 'other.attribute': 3.0, 'some.attribute': 1}
+    print("Dict mapping", d)
 
-    # Check that exceptions are raised as expected.
     try:
         op.attributes["does_not_exist"]
     except KeyError:
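
For readers skimming the diff, the behaviour it adds can be modelled in plain Python. This is only a sketch under assumptions: `AttributeMapSketch` and `_for_each_attr` are hypothetical stand-ins (not MLIR API), and plain Python values take the place of real attributes. It shows one index-based walk feeding `__iter__`, `keys`, `values` and `items`, which is roughly the role `forEachAttr` plays in the C++ above.

```python
from typing import Callable, Iterator, List, Tuple


class AttributeMapSketch:
    """Hypothetical stand-in for the bound attribute map (not MLIR API)."""

    def __init__(self, named_attrs: List[Tuple[str, object]]) -> None:
        # Models a "count + get-by-index" view of an operation's attributes.
        self._named_attrs = list(named_attrs)

    def _for_each_attr(self, fn: Callable[[str, object], None]) -> None:
        # Plays the role of forEachAttr: visit each (name, attribute) once.
        for index in range(len(self._named_attrs)):
            name, attr = self._named_attrs[index]
            fn(name, attr)

    def __getitem__(self, name: str) -> object:
        for key, attr in self._named_attrs:
            if key == name:
                return attr
        raise KeyError(name)

    def __iter__(self) -> Iterator[str]:
        # Iteration yields names, as with a built-in dict.
        return iter(self.keys())

    def keys(self) -> List[str]:
        out: List[str] = []
        self._for_each_attr(lambda name, _attr: out.append(name))
        return out

    def values(self) -> List[object]:
        out: List[object] = []
        self._for_each_attr(lambda _name, attr: out.append(attr))
        return out

    def items(self) -> List[Tuple[str, object]]:
        out: List[Tuple[str, object]] = []
        self._for_each_attr(lambda name, attr: out.append((name, attr)))
        return out


attrs = AttributeMapSketch(
    [("dependent", "text"), ("other.attribute", 3.0), ("some.attribute", 1)]
)
print(sorted(attrs))        # iterating yields names only
print(len(attrs.values()))  # one value per attribute
print(dict(attrs))          # dict() relies on keys() plus __getitem__
```

The `dict(attrs)` call works because Python's `dict` constructor treats any object with a `keys()` method as a mapping and looks each key up through `__getitem__`; that is presumably also what makes `dict(op.attributes)` in the updated test possible.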

@joker-eph
Collaborator

One general remark about handling this in the bindings: there are some underlying changes planned to the whole infra around attributes for the migration to properties, which are to incur some future important breaking changes to code that exposes inherent attributes in a dictionary fashion.

@Wheest
Contributor Author

Wheest commented Oct 13, 2025

> One general remark about handling this in the bindings: there are some underlying changes planned to the whole infra around attributes for the migration to properties, which are to incur some future important breaking changes to code that exposes inherent attributes in a dictionary fashion.

Thanks, I wasn’t aware of this proposal. From what I gather, it moves an operation’s inherent data into a separate property dictionary, while the attribute dictionary will continue to hold discardable data.

Since the attribute dict still exists in dictionary form, the feature in this PR should remain functional. Once the property dict is merged, we’d likely want to expose equivalent behavior there too.

@joker-eph
Collaborator

joker-eph commented Oct 13, 2025

> Thanks, I wasn’t aware of this proposal.

It's been coming for two years and we've been slow to make significant progress, but I still felt it was a good idea to raise awareness of it :)

> From what I gather, it moves an operation’s inherent data into a separate property dictionary, while the attribute dictionary will continue to hold discardable data.

Almost: inherent properties aren't necessarily a dictionary; they can be anything: it's "raw C++" and may not offer easy key-value access.
It'll require specific bindings instead of a generic introspectable API.

Note also: this is, for a large part, already implemented. The "attributes" API you expose relies on code in MLIR that dynamically tries to merge it all into a DictionaryAttr on the fly.
This is inefficient (and not totally correct in all cases); we should instead have two separate APIs to access the discardable and inherent data separately.
This "backward compatibility" is likely to go away at some point, but we shouldn't make it more ingrained in the meantime: I'd rather see the bindings reflect the split that was introduced two years ago in the core C++ APIs, accessing discardable attributes separately from inherent ones.

@makslevental
Contributor

> Almost: inherent properties aren't necessarily a dictionary; they can be anything: it's "raw C++" and may not offer easy key-value access.

> I'd rather see the bindings reflect the split that was introduced two years ago in the core C++ APIs, accessing discardable attributes separately from inherent ones.

The Python bindings use the C API so I don't see how we can support arbitrary "raw C++".

@makslevental
Contributor

> but we shouldn't make it more ingrained in the meantime

Given that we will have to update a lot of this code when the migration is complete, there's no reason to block on it (the migration) for an improvement in UX that's only marginally coupled.

Contributor

@makslevental makslevental left a comment


LGTM modulo one final nit. Thanks!

@joker-eph
Collaborator

> Almost: inherent properties aren't necessarily a dictionary; they can be anything: it's "raw C++" and may not offer easy key-value access.

> I'd rather see the bindings reflect the split that was introduced two years ago in the core C++ APIs, accessing discardable attributes separately from inherent ones.

> The Python bindings use the C API so I don't see how we can support arbitrary "raw C++".

Through generation of C bindings for properties?

@joker-eph
Collaborator

> for an improvement in UX that's only marginally coupled.

If you're improving the UX of an API that we should instead deprecate and replace, then I don't agree.

@makslevental
Contributor

> If you're improving the UX of an API that we should instead deprecate and replace, then I don't agree.

When you have a replacement, we can consider your alternative.

Collaborator

@joker-eph joker-eph left a comment


This requires more discussion right now.

@makslevental
Contributor

makslevental commented Oct 13, 2025

> This requires more discussion right now.

As usual I'm not sure where you draw the authority to block other people's work given purely "feelings"-based arguments. As opposed to feelings, here are facts:

  1. The "migration" as you call it has been ongoing for two years and none of the migrators has made any attempt to produce a working alternative to what we in the bindings have now (if you have a PR that's passing tests I'd be glad to review).
  2. The API that this couples to is mlirOperationGetAttribute (via a nicely factored helper forEachAttr), which appears in 8 other places in this file. According to your reasoning, we should stop absolutely all development of those pieces of functionality until ... your ever-forthcoming migration takes place. Manifestly, in retrospect, that is not how we've operated, nor is now suddenly a moment where we should deviate from this approach (unless you have some exigencies that you haven't made us aware of).
  3. Even if this did incur debt (because we continue to develop APIs that will be deprecated), that debt falls on the maintainers of the bindings. As I already said: I'm perfectly fine updating all of these uses to do whatever it is you imagine you would have them do once your alternative is ready. So there is no problem here either.

I look forward to your discussion but until then (until you provide actual technical reasons), I'd ask you to remove the block.

Contributor

@rolfmorel rolfmorel left a comment


Looks fine to me.

As far as I can tell, this change doesn't really further ingrain the usage of the non-properties attributes API. It just completes the set of methods the attribute map, as a Python dict-like, is expected to have. That is, the existing implementation, i.e. the other methods, has as much of a dependency on this API as the new methods.

While it would be nice to make progress on the properties-aware API, I don't see why this particular PR would need to be blocked until that progress is made.
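
As a rough illustration of what "the set of methods a dict-like is expected to have" means in Python: the standard library's `collections.abc.Mapping` spells out that contract. The sketch below assumes a hypothetical read-only `ReadOnlyAttrView` class (not part of the MLIR bindings) backed by an ordinary dict. Implementing `__getitem__`, `__iter__` and `__len__` is enough for the ABC to supply `keys`, `values`, `items`, `get` and `__contains__`.

```python
from collections.abc import Mapping


class ReadOnlyAttrView(Mapping):
    """Hypothetical read-only view; the name is illustrative, not MLIR API."""

    def __init__(self, pairs):
        # A plain dict stands in for whatever storage the bindings wrap.
        self._pairs = dict(pairs)

    def __getitem__(self, name):
        return self._pairs[name]

    def __iter__(self):
        return iter(self._pairs)

    def __len__(self):
        return len(self._pairs)


view = ReadOnlyAttrView([("some.attribute", 1), ("dependent", "text")])
print("some.attribute" in view)  # __contains__ comes from Mapping
print(sorted(view.keys()))       # so do keys(), values() and items()
print(len(view.items()))
print(view.get("missing"))       # get() returns None instead of raising
```

The PR binds its methods by hand in C++ rather than inheriting from an ABC, so this is an analogy for the contract, not a description of the implementation.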

@jpienaar
Member

This changes existing support to be less surprising for Python users, but doesn't add anything new.

> which are to incur some future important breaking changes to code that exposes inherent attributes in a dictionary fashion.

Yes, but that's not necessarily an issue for the Python API: for inherent attributes one can still generate a dictionary and populate it. This is within the control of the bindings. So the functionality here can still be provided in that world with a change in how the bindings are generated, and users can be migrated without breaking iterators here. For unregistered operations I believe there is just a single attrdict. It also doesn't preclude iterating over intrinsic vs. discardable attributes, or both; that's again possible to stage so that fewer users break. The Python API doesn't have stability guarantees, but we can avoid breaking many users.
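
To make the "generate a dictionary and populate it" idea concrete, here is a small pure-Python sketch. Everything in it is an assumption for illustration: `PropsSketch` stands in for non-dictionary inherent storage, `OpSketch` for an operation, and the field names are invented. It is not how the bindings are generated today.

```python
from dataclasses import dataclass, field


@dataclass
class PropsSketch:
    """Hypothetical struct-like storage for inherent data (not a dict)."""

    predicate: int = 0
    fastmath: str = "none"


@dataclass
class OpSketch:
    """Hypothetical operation with separate inherent and discardable data."""

    props: PropsSketch
    discardable: dict = field(default_factory=dict)

    def inherent_dict(self) -> dict:
        # Populate a fresh dictionary from the struct fields on demand.
        return dict(vars(self.props))

    def iter_names(self, which: str = "both"):
        # Callers choose which namespace(s) to walk.
        if which in ("inherent", "both"):
            yield from self.inherent_dict()
        if which in ("discardable", "both"):
            yield from self.discardable


op = OpSketch(PropsSketch(predicate=2), {"my.tag": True})
print(op.inherent_dict())
print(list(op.iter_names("discardable")))
print(list(op.iter_names()))
```

Because the caller picks the namespace, a combined iteration could stay the default for a while and be narrowed later, which is one way the staging described above might look.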

> The Python bindings use the C API so I don't see how we can support arbitrary "raw C++".

> Through generation of C bindings for properties?

That would seem to require a superset of what Nanobind can do (automatically generate a C API and then generate a Python API to access arbitrary C++ via the C API). That seems really tricky. Attributes work because most are built up from a very simple and constrained set. This is a hard blocker for migration to Properties but not related to this change IMHO.

@rengolin
Member

This:

> If you're improving the UX of an API that we should instead deprecate and replace, then I don't agree.

And:

> The "migration" as you call it has been ongoing for two years and none of the migrators has made any attempt to produce a working alternative to what we in the bindings have now

And:

> That would seem to require a superset of what Nanobind can do (automatically generate a C API and then generate a Python API to access arbitrary C++ via the C API). That seems really tricky. Attributes work because most are built up from a very simple and constrained set. This is a hard blocker for migration to Properties but not related to this change IMHO.

Are really strong reasons to accept this change for what it is, allowing current users (attributes) to have a better experience, since there's no telling when (or even if) the migration to properties will happen.

This is a common situation in LLVM, where multiple implementations co-exist for some time, and it's up to the maintainers to keep up with the churn. Since @makslevental volunteered to do that work, I see no reason to block this PR from merging.

@joker-eph
Collaborator

joker-eph commented Oct 15, 2025

I'm a bit puzzled by the line of comments here, which don't seem to be trying to build a shared understanding and make progress. Let me elaborate on why I mentioned that we should discuss this a bit more.

Maybe we are starting from different assumptions or knowledge about the state of the codebase and the general direction.
First, the Python bindings are, as the name indicates, bindings to MLIR. As such I expect that we keep a strong alignment between the evolution of MLIR concepts and APIs and the bindings: they can't just evolve entirely separately; it's not an independent project. Of course the bindings also offer a more "pythonic" UI, often as a layer on top of the lowest level of bindings, where things can diverge from some of the C++ layer because of the differences between the languages (I'm thinking about the builders for ops with regions for example, or the use of Python's with-statement contexts, etc.).

Now, onto the state of discardable/inherent attributes, and more generally properties: it seems that there are a lot of misconceptions about where we are and where we're going.
While introducing properties, we also formally separated discardable and inherent attributes and started reflecting this in the API of the Operation class. There are now APIs like getDiscardableAttrDictionary(), getDiscardableAttrs(), getRawDictionaryAttrs(), ... and also getInherentAttr intended to expose inherent attributes during the transition.
For backward compatibility, we tried to preserve (with some quirks, but mostly working) pre-existing APIs that mix up discardable and inherent attributes: these are fragile and we need to migrate away from them in order to formally deprecate and remove them. This in-tree migration is what I referred to earlier as being incomplete: we can't remove APIs that still have in-tree uses!
However the fact that we still have in-tree users for the old API does not change the fact that we need to promote and encourage APIs that separate access to inherent and discardable attributes, as these have different storage and a really different "namespace" (you can have an attribute with the same "key" for a discardable and an inherent attribute, which would expose how buggy the old API is: one of them would overwrite the other somehow).
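
The namespace concern can be illustrated with plain Python dictionaries. This is a sketch with hypothetical contents; it does not reproduce how MLIR actually performs the merge, only the general hazard of folding two namespaces into one.

```python
from collections import ChainMap

# Hypothetical contents; the key "value" exists in both namespaces.
inherent = {"value": 42}
discardable = {"value": "note", "my.tag": True}

# A naive merge into one dictionary: the later mapping silently wins.
merged = {**inherent, **discardable}
print(merged["value"], len(merged))

# Separate views keep both entries addressable.
print(inherent["value"], discardable["value"])

# ChainMap keeps both mappings, but a lookup by name still hides one.
combined = ChainMap(discardable, inherent)
print(combined["value"], len(combined))
```

Three entries go in but only two names come out of the merged view, so one of the two "value" entries becomes unreachable through it.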

Now, getting to the Python side, the question is about these APIs and their exposure: the reason I'm raising awareness here (and raising some concerns over misalignment) is that while this patch seems like a good idea (it makes something clearly more "pythonic" and is a better UI), it does so on top of an API that is on its way to deprecation and whose replacement has been available in tree for more than two years. So I'm asking why we aren't exposing the new API instead, and applying the great improvements in look and feel to the APIs that we want users to actually use. Python has a two-year lag on the C++ API here, and the longer we wait the more brutal the deprecation (and removal) of the older API will be, which will make users of this API less happy, which I suspect is something we all strive to avoid (unnecessarily at least).

@makslevental
Contributor

makslevental commented Oct 15, 2025

> So I'm asking why we aren't exposing the new API instead, and applying the great improvements in look and feel to the APIs that we want users to actually use.

The answer has already been provided to you: because this patch is wholly orthogonal to that work. Let me put it in terms of a very simple metaphor: if the foundation of my house needs replacing, I can still fix the lock on my door (to replace a faulty mechanism, to prevent thieves, to make ingress/egress easier for laborers working on the foundation, etc.) even if the new foundation will eventually require a new door frame.

> However the fact that we still have in-tree users for the old API does not change the fact that we need to promote and encourage APIs...

I don't know about others in this PR but I certainly do not feel encouraged when someone shows up and demands people interrupt their work to satisfy arbitrary, unrelated goals. In fact, I feel extremely discouraged about my (and others') ability to make meaningful progress on work without constantly having to satisfy someone's demands. Furthermore, as we always say: "patches always welcome". If you have been dissatisfied for an entire 2 years with the relative lag of the bindings behind core (in our disuse of the new APIs), you should have felt at liberty to send PRs rectifying our disuse! And your own dissatisfaction! In fact you should feel free even now, today, right this moment to send any such PRs!

> which will make users of this API less happy, which I suspect is something we all strive to avoid (unnecessarily at least).

If your actual concern is the user facing API, then I feel you do not understand the patch itself (as I often feel at these impasses) - this PR does not in any way change the actual user facing API. What it does is complete the currently incomplete implementation of the dict protocol. In whatever future you envision, this is the contract you will have to fulfill, irrespective of whether the current contract is unfulfilled. By blocking this PR, all you are doing is gating the correct implementation of that contract behind properties, which is a complete non-sequitur.

Thus, I am once again asking you, politely, to remove the block, and move this discussion elsewhere (e.g., a discourse post).
