62 changes: 31 additions & 31 deletions mlir/docs/Bindings/Python.md
### Pre-requisites

* A relatively recent Python3 installation
* Installation of Python dependencies as specified in
`mlir/python/requirements.txt`

### CMake variables

### Recommended development practices

It is recommended to use a Python virtual environment. Many ways exist for this,
but the following is the simplest:

```shell
python -m pip install -r mlir/python/requirements.txt

# Now run your build command with `cmake`, `ninja`, et al.

# Run mlir tests. For example, to run Python bindings tests only using ninja:
ninja check-mlir-python
```

the `PYTHONPATH`. Typically:

```shell
export PYTHONPATH=$(cd build && pwd)/tools/mlir/python_packages/mlir_core
```

Note that if you have installed (i.e. via `ninja install`, et al), then Python
packages for all enabled projects will be in your install tree under
`python_packages/` (i.e. `python_packages/mlir_core`). Official distributions
are built with a more specialized setup.

### Use cases

There are likely two primary use cases for the MLIR Python bindings:

1. Support users who expect that an installed version of LLVM/MLIR will yield
the ability to `import mlir` and use the API in a pure way out of the box.

1. Downstream integrations will likely want to include parts of the API in
their private namespace or specially built libraries, probably mixing it
with other Python native bits.

### Composable modules

In order to support use case \#2, the Python bindings are organized into
composable modules that downstream integrators can include and re-export into
their own namespace if desired. This forces several design points:

* Separate the construction/populating of a `nb::module_` from the
  `NB_MODULE` global constructor.

* Introduce headers for C++-only wrapper classes, as other related C++ modules
  will need to interoperate with them.
registration, etc.

### Loader

LLVM/MLIR is a non-trivial Python-native project that is likely to co-exist with
other non-trivial native extensions. As such, the native extension (i.e. the
`.so`/`.pyd`/`.dylib`) is exported as a notionally private top-level symbol
(`_mlir`), while a small set of Python code is provided in
are) with non-RTTI polymorphic C++ code (the default compilation mode of LLVM).
### Ownership in the Core IR

There are several top-level types in the core IR that are strongly owned by
their Python-side reference:

* `PyContext` (`mlir.ir.Context`)
* `PyModule` (`mlir.ir.Module`)
Due to the validity and parenting accounting needs, `PyOperation` is the owner
for regions and blocks. Operations are also the only entities which are allowed to be in
a detached state.

**Note**: Multiple `PyOperation` objects (i.e., the Python objects themselves) can alias a single `mlir::Operation`.
This means, for example, if you have `py_op1` and `py_op2` which wrap the same `mlir::Operation op`
and you somehow transform `op` (e.g., you run a pass on `op`), then walking the MLIR AST via either `py_op1` or `py_op2`
will reflect the same MLIR AST. This is perfectly safe and supported. What is not supported is invalidating any
operation while there exist multiple Python objects wrapping that operation **and then manipulating those wrappers**.
For example if `py_op1` and `py_op2` wrap the same operation under a root `py_op3` and then `py_op3` is
transformed such that the operation referenced (by `py_op1`, `py_op2`) is erased. Then `py_op1`, `py_op2`
become "undefined" in a sense; manipulating them in any way is "formally forbidden". Note, this also applies to
`SymbolTable` mutation, which is considered a transformation of the root `SymbolTable`-supporting operation for the
purposes of the discussion here. Metaphorically, one can think of this similarly to how STL container iterators are invalidated once the container itself is changed. The "best practices" recommendation is to structure your code such that

1. First, query/manipulate various Python wrapper objects `py_op1`, `py_op2`, `py_op3`, etc.;
2. Second, transform the AST/erase operations/etc. via a single root object;
3. Invalidate all queried nodes (e.g., using `op._set_invalid()`).

Ideally this should be done in a function body so that step (3) corresponds to the end of the function and there are no
risks of Python wrapper objects leaking/living longer than necessary. In summary, you should scope your changes based on
nesting, i.e., change leaf nodes first before going up the hierarchy, and only
in very rare cases query nested ops after modifying a parent op.
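The scoping discipline above can be sketched with a plain-Python stand-in (the `Wrapper` class below is an illustrative mock, not the real `PyOperation`; only the `_set_invalid()` name is taken from the recommendation above):

```python
class Wrapper:
    """Illustrative stand-in for a Python object wrapping an MLIR operation."""

    def __init__(self, name):
        self.name = name
        self._valid = True

    def _set_invalid(self):
        self._valid = False

    def query(self):
        # Touching an invalidated wrapper is formally forbidden.
        assert self._valid, f"use of invalidated wrapper {self.name!r}"
        return self.name


def scoped_mutation():
    # 1. Query/manipulate leaf wrappers first...
    py_op1, py_op2 = Wrapper("leaf1"), Wrapper("leaf2")
    names = [py_op1.query(), py_op2.query()]
    # 2. ...then transform the AST via a single root object
    #    (the transformation itself is elided in this sketch)...
    # 3. ...and invalidate every queried node before leaving the scope.
    for w in (py_op1, py_op2):
        w._set_invalid()
    return names


print(scoped_mutation())  # ['leaf1', 'leaf2']
```

Keeping all three steps inside one function body guarantees the invalidated wrappers cannot escape the scope and be touched later.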

This allows invoking creation of an op that takes an `I32Attr` with:

```python
foo.Op(30)
```

The registration is based on the ODS name, but the registry is populated via a
pure Python method. Only a single custom builder may be registered per ODS
attribute type (e.g., `I32Attr` can have only one, which can correspond to
multiple instances of the underlying `IntegerAttr` type).
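As a sketch of what such a one-builder-per-type registry looks like, here is a plain-Python model (the names `ATTRIBUTE_BUILDERS` and `register_attribute_builder` are illustrative assumptions, not the actual `mlir.ir` implementation):

```python
# Illustrative model of a one-builder-per-ODS-attribute-type registry.
ATTRIBUTE_BUILDERS = {}


def register_attribute_builder(ods_name):
    """Decorator registering a builder for an ODS attribute name."""

    def decorator(fn):
        if ods_name in ATTRIBUTE_BUILDERS:
            raise RuntimeError(f"builder already registered for {ods_name}")
        ATTRIBUTE_BUILDERS[ods_name] = fn
        return fn

    return decorator


@register_attribute_builder("I32Attr")
def build_i32(value):
    # A real builder would construct an IntegerAttr; we return a tag tuple.
    return ("IntegerAttr", 32, value)


print(ATTRIBUTE_BUILDERS["I32Attr"](30))  # ('IntegerAttr', 32, 30)
```

Registering a second builder for `"I32Attr"` raises, enforcing the single-builder rule described above.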
Expand All @@ -795,13 +795,13 @@ either for practicality or to give the resulting library an appropriately

Generally favor converting trivial methods like `getContext()`, `getName()`,
`isEntryBlock()`, etc. to read-only Python properties (i.e. `context`). It is
primarily a matter of calling `def_prop_ro` vs `def` in binding code,
and makes things feel much nicer to the Python side.

For example, prefer:

```c++
m.def_prop_ro("context", ...)
```

Over:

```c++
m.def("context", ...)
```
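Viewed from Python, the difference can be illustrated with a plain-Python stand-in (the `Block` class below is a mock, not the real binding):

```python
class Block:
    """Plain-Python stand-in illustrating property vs. getter style."""

    def __init__(self, context):
        self._context = context

    @property
    def context(self):  # property style: block.context
        return self._context

    def getContext(self):  # method style: block.getContext()
        return self._context


b = Block("ctx")
assert b.context == b.getContext() == "ctx"
```

The property spelling reads as attribute access, which is the idiomatic Python surface for cheap, side-effect-free accessors.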
The MLIR Python bindings integrate with the tablegen-based ODS system for
providing user-friendly wrappers around MLIR dialects and operations. There are
multiple parts to this integration, outlined below. Most details have been
elided: refer to the build rules and Python sources under `mlir.dialects` for
the canonical way to use this facility.

Users are responsible for providing a `{DIALECT_NAMESPACE}.py` (or an equivalent
directory with `__init__.py` file) as the entrypoint.

### Generating `_{DIALECT_NAMESPACE}_ops_gen.py` wrapper modules

Each dialect with a mapping to Python requires that an appropriate
`_{DIALECT_NAMESPACE}_ops_gen.py` wrapper module is created. This is done by
invoking `mlir-tblgen` on a Python-bindings specific tablegen wrapper that
includes the boilerplate and the actual dialect-specific `td` file. An example,
for the `Func` dialect (which is assigned the namespace `func` as a special
case):

```python
from ._my_dialect_ops_gen import *
```

### Extending the search path for wrapper modules

When the Python bindings need to locate a wrapper module, they consult the
`dialect_search_path` and use it to find an appropriately named module. For the
main repository, this search path is hard-coded to include the `mlir.dialects`
module, which is where wrappers are emitted by the above build rule. Out of tree
subclasses can be defined using
[`include/mlir/Bindings/Python/PybindAdaptors.h`](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/PybindAdaptors.h)
or
[`include/mlir/Bindings/Python/NanobindAdaptors.h`](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Bindings/Python/NanobindAdaptors.h)
utilities that mimic pybind11/nanobind APIs for defining functions and
properties. These bindings are to be included in a separate module. The
utilities also provide automatic casting between C API handles `MlirAttribute`
and `MlirType` and their Python counterparts so that the C API handles can be
are available when the dialect is loaded from Python.
Dialect-specific passes can be made available to the pass manager in Python by
registering them with the context and relying on the API for pass pipeline
parsing from string descriptions. This can be achieved by creating a new
nanobind module, defined in `lib/Bindings/Python/<Dialect>Passes.cpp`, that
calls the registration C API, which must be provided first. For passes defined
declaratively using Tablegen, `mlir-tblgen -gen-pass-capi-header` and
`mlir-tblgen -gen-pass-capi-impl` automate the generation of the C API. The
nanobind module must be compiled into a separate “Python extension” library,
which can be `import`ed from the main dialect file, i.e.
`python/mlir/dialects/<dialect-namespace>.py` or
`python/mlir/dialects/<dialect-namespace>/__init__.py`, or from a separate