Commit 41bb6ed

[mlir][docs] Migrate code examples to nanobind, make Python spelling consistent (#163933)

Since the bindings now use nanobind, I changed the code examples and mentions in the documentation prose to mention nanobind concepts and symbols wherever applicable.

I also made the spelling of "Python" consistent by choosing the uppercase name everywhere that's not an executable name, part of a URL, or directory name.

----------------

Note that I left mentions of `PybindAdaptors.h` in because of llvm/llvm-project#162309.

Are there any thoughts about adding a virtual environment setup guide using [uv](https://docs.astral.sh/uv/)? It has gotten pretty popular, and is much faster than a "vanilla" Python pip install. It can also bootstrap an interpreter not present on the user's machine, for example a free-threaded Python build, with the `-p` flag to the `uv venv` virtual environment creation command.

1 parent e219cf6 commit 41bb6ed

1 file changed: +51 -33 lines changed

mlir/docs/Bindings/Python.md

Lines changed: 51 additions & 33 deletions
@@ -9,7 +9,7 @@
99
### Pre-requisites
1010

1111
* A relatively recent Python3 installation
12-
* Installation of python dependencies as specified in
12+
* Installation of Python dependencies as specified in
1313
`mlir/python/requirements.txt`
1414

1515
### CMake variables
@@ -27,8 +27,8 @@
2727

2828
### Recommended development practices
2929

30-
It is recommended to use a python virtual environment. Many ways exist for this,
31-
but the following is the simplest:
30+
It is recommended to use a Python virtual environment. Many ways exist for this,
31+
but one of the following is generally recommended:
3232

3333
```shell
3434
# Make sure your 'python' is what you expect. Note that on multi-python
@@ -37,7 +37,22 @@ but the following is the simplest:
3737
which python
3838
python -m venv ~/.venv/mlirdev
3939
source ~/.venv/mlirdev/bin/activate
40+
```
41+
42+
Or, if you have uv installed on your system, you can also use the following commands
43+
to create the same environment (targeting a Python 3.12 toolchain in this example):
44+
45+
```shell
46+
uv venv ~/.venv/mlirdev --seed -p 3.12
47+
source ~/.venv/mlirdev/bin/activate
48+
```
49+
50+
You can change the Python version (`-p` flag) as needed - if you request any Python interpreter
51+
not present on your system, uv will attempt to download it, unless the `--no-python-downloads` option is given.
52+
For information on how to install uv, refer to the official documentation at
53+
https://docs.astral.sh/uv/getting-started/installation/
4054

55+
```shell
4156
# Note that many LTS distros will bundle a version of pip itself that is too
4257
# old to download all of the latest binaries for certain platforms.
4358
# The pip version can be obtained with `python -m pip --version`, and for
@@ -46,14 +61,16 @@ source ~/.venv/mlirdev/bin/activate
4661
# It is recommended to upgrade pip:
4762
python -m pip install --upgrade pip
4863

49-
5064
# Now the `python` command will resolve to your virtual environment and
5165
# packages will be installed there.
5266
python -m pip install -r mlir/python/requirements.txt
5367

68+
# In a uv-generated virtual environment, you can instead run:
69+
uv pip install -r mlir/python/requirements.txt
70+
5471
# Now run your build command with `cmake`, `ninja`, et al.
5572

56-
# Run mlir tests. For example, to run python bindings tests only using ninja:
73+
# Run mlir tests. For example, to run Python bindings tests only using ninja:
5774
ninja check-mlir-python
5875
```
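
The shell comments in this hunk stress confirming that `python` resolves to the interpreter you expect before installing requirements. As a minimal sketch (the helper name `in_virtualenv` is made up for illustration and is not part of MLIR), the standard library can report whether the running interpreter belongs to a virtual environment. This should hold for environments created with either `python -m venv` or `uv venv`, assuming both produce a standard venv layout:

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment.

    Inside a venv, sys.prefix points at the environment, while
    sys.base_prefix still points at the installation it was created from.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    # Show which interpreter is running and whether a venv is active.
    print("interpreter:", sys.executable)
    print("virtual environment active:", in_virtualenv())
```

Running this with the `python` on your `PATH` right after `source ~/.venv/mlirdev/bin/activate` is a quick way to check that later `pip install` commands will target the environment rather than the system interpreter.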
5976

@@ -65,7 +82,7 @@ the `PYTHONPATH`. Typically:
6582
export PYTHONPATH=$(cd build && pwd)/tools/mlir/python_packages/mlir_core
6683
```
6784

68-
Note that if you have installed (i.e. via `ninja install`, et al), then python
85+
Note that if you have installed (i.e. via `ninja install`, et al), then Python
6986
packages for all enabled projects will be in your install tree under
7087
`python_packages/` (i.e. `python_packages/mlir_core`). Official distributions
7188
are built with a more specialized setup.
@@ -74,23 +91,23 @@ are built with a more specialized setup.
7491

7592
### Use cases
7693

77-
There are likely two primary use cases for the MLIR python bindings:
94+
There are likely two primary use cases for the MLIR Python bindings:
7895

7996
1. Support users who expect that an installed version of LLVM/MLIR will yield
8097
the ability to `import mlir` and use the API in a pure way out of the box.
8198

8299
1. Downstream integrations will likely want to include parts of the API in
83100
their private namespace or specially built libraries, probably mixing it
84-
with other python native bits.
101+
with other Python native bits.
85102

86103
### Composable modules
87104

88105
In order to support use case \#2, the Python bindings are organized into
89106
composable modules that downstream integrators can include and re-export into
90107
their own namespace if desired. This forces several design points:
91108

92-
* Separate the construction/populating of a `py::module` from
93-
`PYBIND11_MODULE` global constructor.
109+
* Separate the construction/populating of a `nb::module` from
110+
`NB_MODULE` global constructor.
94111

95112
* Introduce headers for C++-only wrapper classes as other related C++ modules
96113
will need to interop with it.
@@ -130,7 +147,7 @@ registration, etc.
130147

131148
### Loader
132149

133-
LLVM/MLIR is a non-trivial python-native project that is likely to co-exist with
150+
LLVM/MLIR is a non-trivial Python-native project that is likely to co-exist with
134151
other non-trivial native extensions. As such, the native extension (i.e. the
135152
`.so`/`.pyd`/`.dylib`) is exported as a notionally private top-level symbol
136153
(`_mlir`), while a small set of Python code is provided in
@@ -160,7 +177,7 @@ are) with non-RTTI polymorphic C++ code (the default compilation mode of LLVM).
160177
### Ownership in the Core IR
161178

162179
There are several top-level types in the core IR that are strongly owned by
163-
their python-side reference:
180+
their Python-side reference:
164181

165182
* `PyContext` (`mlir.ir.Context`)
166183
* `PyModule` (`mlir.ir.Module`)
@@ -219,23 +236,24 @@ Due to the validity and parenting accounting needs, `PyOperation` is the owner
219236
for regions and blocks. Operations are also the only entities which are allowed to be in
220237
a detached state.
221238

222-
**Note**: Multiple `PyOperation` objects (i.e., the Python objects themselves) can alias a single `mlir::Operation`.
223-
This means, for example, if you have `py_op1` and `py_op2` which wrap the same `mlir::Operation op`
239+
**Note**: Multiple `PyOperation` objects (i.e., the Python objects themselves) can alias a single `mlir::Operation`.
240+
This means, for example, if you have `py_op1` and `py_op2` which wrap the same `mlir::Operation op`
224241
and you somehow transform `op` (e.g., you run a pass on `op`) then walking the MLIR AST via either/or `py_op1`, `py_op2`
225-
will reflect the same MLIR AST. This is perfectly safe and supported. What is not supported is invalidating any
226-
operation while there exist multiple Python objects wrapping that operation **and then manipulating those wrappers**.
227-
For example if `py_op1` and `py_op2` wrap the same operation under a root `py_op3` and then `py_op3` is
228-
transformed such that the operation referenced (by `py_op1`, `py_op2`) is erased. Then `py_op1`, `py_op2`
229-
become "undefined" in a sense; manipulating them in any way is "formally forbidden". Note, this also applies to
230-
`SymbolTable` mutation, which is considered a transformation of the root `SymbolTable`-supporting operation for the
231-
purposes of the discussion here. Metaphorically, one can think of this similarly to how STL container iterators are invalidated once the container itself is changed. The "best practices" recommendation is to structure your code such that
242+
will reflect the same MLIR AST. This is perfectly safe and supported. What is not supported is invalidating any
243+
operation while there exist multiple Python objects wrapping that operation **and then manipulating those wrappers**.
244+
For example if `py_op1` and `py_op2` wrap the same operation under a root `py_op3` and then `py_op3` is
245+
transformed such that the operation referenced (by `py_op1`, `py_op2`) is erased. Then `py_op1`, `py_op2`
246+
become "undefined" in a sense; manipulating them in any way is "formally forbidden". Note, this also applies to
247+
`SymbolTable` mutation, which is considered a transformation of the root `SymbolTable`-supporting operation for the
248+
purposes of the discussion here. Metaphorically, one can think of this similarly to how STL container iterators are invalidated
249+
once the container itself is changed. The "best practices" recommendation is to structure your code such that
232250

233251
1. First, query/manipulate various Python wrapper objects `py_op1`, `py_op2`, `py_op3`, etc.;
234252
2. Second, transform the AST/erase operations/etc. via a single root object;
235253
3. Invalidate all queried nodes (e.g., using `op._set_invalid()`).
236254

237-
Ideally this should be done in a function body so that step (3) corresponds to the end of the function and there are no
238-
risks of Python wrapper objects leaking/living longer than necessary. In summary, you should scope your changes based on
255+
Ideally this should be done in a function body so that step (3) corresponds to the end of the function and there are no
256+
risks of Python wrapper objects leaking/living longer than necessary. In summary, you should scope your changes based on
239257
nesting i.e., change leaf nodes first before going up in hierarchy, and only in very rare cases query nested ops post
240258
modifying a parent op.
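
The query, transform, invalidate ordering recommended in this note can be sketched with a toy model. The `ToyOp` class and `rewrite` function below are hypothetical stand-ins written only for illustration; they do not use or reproduce the real `PyOperation` implementation. In particular, the toy raises an error on use after invalidation, whereas the note only says such use is "formally forbidden":

```python
class ToyOp:
    """Hypothetical stand-in for a Python wrapper around an operation."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.valid = True

    def _check(self):
        # Refuse to touch a wrapper whose underlying operation was erased.
        if not self.valid:
            raise RuntimeError(f"{self.name} wraps an erased operation")

    def add(self, child):
        self._check()
        self.children.append(child)
        return child

    def describe(self):
        self._check()
        return f"{self.name} with {len(self.children)} nested op(s)"

    def invalidate(self):
        # Invalidate nested wrappers first, then this one.
        for child in self.children:
            child.invalidate()
        self.valid = False

    def erase_children(self):
        # Transform through the root: erase nested ops and mark their
        # wrappers invalid so that later use is detected.
        self._check()
        for child in self.children:
            child.invalidate()
        self.children.clear()


def rewrite(root):
    # Step 1: query the nested wrappers while they are still valid.
    summaries = [child.describe() for child in root.children]
    # Steps 2 and 3: transform via the single root object, which also
    # invalidates every wrapper that was queried above.
    root.erase_children()
    return summaries


if __name__ == "__main__":
    root = ToyOp("module")
    leaf = root.add(ToyOp("func"))
    print(rewrite(root))
    print("leaf still valid:", leaf.valid)
```

Keeping all three steps inside one function, as `rewrite` does, means no wrapper for an erased operation is handed back to the caller, which is the scoping the note recommends.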
241259

@@ -773,7 +791,7 @@ This allows to invoke op creation of an op with a `I32Attr` with
773791
foo.Op(30)
774792
```
775793

776-
The registration is based on the ODS name but registry is via pure python
794+
The registration is based on the ODS name but registry is via pure Python
777795
method. Only single custom builder is allowed to be registered per ODS attribute
778796
type (e.g., I32Attr can have only one, which can correspond to multiple of the
779797
underlying IntegerAttr type).
@@ -795,13 +813,13 @@ either for practicality or to give the resulting library an appropriately
795813

796814
Generally favor converting trivial methods like `getContext()`, `getName()`,
797815
`isEntryBlock()`, etc to read-only Python properties (i.e. `context`). It is
798-
primarily a matter of calling `def_property_readonly` vs `def` in binding code,
816+
primarily a matter of calling `def_prop_ro` vs `def` in binding code,
799817
and makes things feel much nicer to the Python side.
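
To show what this choice means for callers, here is a pure-Python analogy. `ToyBlock` is a hypothetical class, not the real `mlir.ir.Block`, and Python's built-in `property` is used only as a rough counterpart to a read-only property defined in binding code:

```python
class ToyBlock:
    """Hypothetical wrapper used only to contrast the two call styles."""

    def __init__(self, context):
        self._context = context

    @property
    def context(self):
        # Property style: read as `block.context`, with no setter defined.
        return self._context

    def get_context(self):
        # Method style: read as `block.get_context()`.
        return self._context


if __name__ == "__main__":
    block = ToyBlock("ctx")
    print(block.context)        # attribute access
    print(block.get_context())  # explicit call
```

Because no setter is defined, assigning to `block.context` raises `AttributeError`, which matches the read-only intent described above.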
800818

801819
For example, prefer:
802820

803821
```c++
804-
m.def_property_readonly("context", ...)
822+
m.def_prop_ro("context", ...)
805823
```
806824

807825
Over:
@@ -914,17 +932,17 @@ def create_my_op():
914932
The MLIR Python bindings integrate with the tablegen-based ODS system for
915933
providing user-friendly wrappers around MLIR dialects and operations. There are
916934
multiple parts to this integration, outlined below. Most details have been
917-
elided: refer to the build rules and python sources under `mlir.dialects` for
935+
elided: refer to the build rules and Python sources under `mlir.dialects` for
918936
the canonical way to use this facility.
919937

920938
Users are responsible for providing a `{DIALECT_NAMESPACE}.py` (or an equivalent
921939
directory with `__init__.py` file) as the entrypoint.
922940

923941
### Generating `_{DIALECT_NAMESPACE}_ops_gen.py` wrapper modules
924942

925-
Each dialect with a mapping to python requires that an appropriate
943+
Each dialect with a mapping to Python requires that an appropriate
926944
`_{DIALECT_NAMESPACE}_ops_gen.py` wrapper module is created. This is done by
927-
invoking `mlir-tblgen` on a python-bindings specific tablegen wrapper that
945+
invoking `mlir-tblgen` on a Python-bindings specific tablegen wrapper that
928946
includes the boilerplate and actual dialect specific `td` file. An example, for
929947
the `Func` (which is assigned the namespace `func` as a special case):
930948

@@ -954,7 +972,7 @@ from ._my_dialect_ops_gen import *
954972

955973
### Extending the search path for wrapper modules
956974

957-
When the python bindings need to locate a wrapper module, they consult the
975+
When the Python bindings need to locate a wrapper module, they consult the
958976
`dialect_search_path` and use it to find an appropriately named module. For the
959977
main repository, this search path is hard-coded to include the `mlir.dialects`
960978
module, which is where wrappers are emitted by the above build rule. Out of tree
@@ -1153,7 +1171,7 @@ subclasses can be defined using
11531171
[`include/mlir/Bindings/Python/PybindAdaptors.h`](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/PybindAdaptors.h)
11541172
or
11551173
[`include/mlir/Bindings/Python/NanobindAdaptors.h`](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/NanobindAdaptors.h)
1156-
utilities that mimic pybind11/nanobind API for defining functions and
1174+
utilities that mimic pybind11/nanobind APIs for defining functions and
11571175
properties. These bindings are to be included in a separate module. The
11581176
utilities also provide automatic casting between C API handles `MlirAttribute`
11591177
and `MlirType` and their Python counterparts so that the C API handles can be
@@ -1176,11 +1194,11 @@ are available when the dialect is loaded from Python.
11761194
Dialect-specific passes can be made available to the pass manager in Python by
11771195
registering them with the context and relying on the API for pass pipeline
11781196
parsing from string descriptions. This can be achieved by creating a new
1179-
pybind11 module, defined in `lib/Bindings/Python/<Dialect>Passes.cpp`, that
1197+
nanobind module, defined in `lib/Bindings/Python/<Dialect>Passes.cpp`, that
11801198
calls the registration C API, which must be provided first. For passes defined
11811199
declaratively using Tablegen, `mlir-tblgen -gen-pass-capi-header` and
11821200
`-mlir-tblgen -gen-pass-capi-impl` automate the generation of C API. The
1183-
pybind11 module must be compiled into a separate “Python extension” library,
1201+
nanobind module must be compiled into a separate “Python extension” library,
11841202
which can be `import`ed from the main dialect file, i.e.
11851203
`python/mlir/dialects/<dialect-namespace>.py` or
11861204
`python/mlir/dialects/<dialect-namespace>/__init__.py`, or from a separate
