| 1 | +Developer guide — pyalp / alp-graphblas |
| 2 | +===================================== |
| 3 | + |
| 4 | +Author: |
| 5 | +Denis Jelovina |
| 6 | + |
| 7 | +Support: |
| 8 | +For support or to report issues, please open an issue on the project's GitHub issue tracker. For direct contact, email [email protected] |
| 9 | + |
| 10 | +This document explains how the Python packaging for the pyalp bindings works, how CI builds wheels, and what to change when you add a new compiled backend (pybind11 module) or Python dependency. |
| 11 | + |
| 12 | +C++ binding logic and Python usage (summary) |
| 13 | +------------------------------------------- |
| 14 | +The pyalp package exposes native C++ backends built with pybind11. Each backend is compiled as a separate Python extension module (shared object) with a canonical name like `pyalp_ref`, `pyalp_omp`, or `pyalp_nonblocking`. The packaging layout installs those compiled modules into the `pyalp` package so they are importable as `pyalp.pyalp_ref`, `pyalp.pyalp_omp`, etc. |
| 15 | + |
| 16 | +How Python code uses the compiled backends |
| 17 | +- Direct import: after installation you can import a backend module directly, for example: |
| 18 | + |
| 19 | + import pyalp.pyalp_ref |
| 20 | + M = pyalp.pyalp_ref.Matrix(10, 10) |
| 21 | + |
| 22 | +- Helper API: the package also provides helper APIs that discover and return backends at runtime, e.g. `pyalp.get_backend('pyalp_ref')`, which returns the compiled module object. This is useful for selecting backends dynamically.
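
Runtime backend selection can also be sketched with the standard library alone. The helper below is hypothetical (`load_first_available` is not part of pyalp); it assumes only that backends are importable as submodules of the `pyalp` package, as described above, and tries each name in order of preference.

```python
import importlib


def load_first_available(names, package="pyalp"):
    """Return the first importable submodule of `package`, or None.

    Hypothetical helper for illustration; not part of the pyalp API.
    """
    for name in names:
        try:
            return importlib.import_module(f"{package}.{name}")
        except ImportError:
            continue  # backend not built or not installed; try the next one
    return None


# Prefer the OpenMP backend and fall back to the reference backend.
# The result is None when neither backend can be imported.
backend = load_first_available(["pyalp_omp", "pyalp_ref"])
```

Returning `None` instead of raising leaves the decision to the caller, who can fall back to another code path or report which backends were tried.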
| 23 | + |
| 24 | +How the Python object maps to C++ |
| 25 | +- Each compiled extension is a pybind11 module which registers C++ types (Matrix, Vector, operators) and functions. The pybind11 binding code (in the `pyalp` C++ sources) defines the Python-visible class names and methods, so `pyalp.pyalp_ref.Matrix` is a Python wrapper around the C++ Matrix implementation in the native backend.
| 26 | +- At build time, CMake compiles the C++ sources into a platform-specific shared object; the packaging step copies that shared object into the `pyalp` package so the interpreter can import it as a normal module. |
| 27 | + |
| 28 | +Current functional limitations and caveats |
| 29 | +- Cross-backend imports: importing different backend modules in the same Python process can cause pybind11 type-registration collisions (duplicate registrations of the same C++ types across modules). The bindings now use `py::module_local()` for many wrapper types to reduce collisions, but issues can still occur. If you need repeatable cross-backend usage, either run backends in separate processes or design a shared-registration approach (single module that dispatches to backends or explicit shared-type registration across modules). |
| 30 | +- Cross-backend bindings: supporting full cross-backend interoperability requires either |
| 31 | + - a single compiled extension exporting a stable API and selecting backends internally, or |
| 32 | + - explicit cross-registration code that ensures each type is only registered once (or registered with module-local variants and safe conversion functions). Both approaches require C++ changes and careful testing. |
| 33 | +- Wheel portability and optimization trade-offs: |
| 34 | + - Wheels are built per-ABI and per-OS (CI uses per-ABI build dirs). The project disables aggressive target-specific flags (no `-march=native`, LTO off) to improve portability, but wheels are still platform/ABI-specific (glibc versus musl, macOS SDK versions). Expect different wheel filenames per ABI/OS and possible limitations on older OS versions. |
| 35 | + - CI currently skips `*-musllinux*` and does not publish Windows wheels by default (see CI matrix). If you need musl or Windows support, update the CI configuration and the before-build steps to provide appropriate toolchains and packaging options. |
| 36 | +- Size and dependency implications: bundling multiple backends increases wheel size. |
| 37 | + |
| 38 | +If you plan to change the bindings or support cross-backend imports, read the `pybind11` docs on module-local registrations and consider writing small integration tests that import multiple backends in isolated subprocesses. |
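
One way to write such an isolated test is to import each backend in a fresh interpreter, so that type registrations from one backend can never meet those of another. This is a minimal sketch, not the project's test code: `imports_cleanly` is a hypothetical helper, and the backend names are the ones listed earlier in this guide.

```python
import subprocess
import sys


def imports_cleanly(module_name, timeout=60):
    """Return True if `module_name` imports in a fresh Python process."""
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
        timeout=timeout,
    )
    return result.returncode == 0


# One process per backend; a backend that is not installed reports False.
results = {
    name: imports_cleanly(f"pyalp.{name}")
    for name in ("pyalp_ref", "pyalp_omp", "pyalp_nonblocking")
}
```

A smoke test can then assert on `results` for the backends the wheel under test is expected to ship.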
| 39 | + |
| 40 | + |
| 41 | +Local builds (tested with `pyalp-ci.yml`) |
| 42 | +----------------------------------------- |
| 43 | +If you prefer fast iteration or want to debug native build issues locally, build and test wheels on your machine. The repository provides `pyalp-ci.yml` to exercise the build steps in CI (useful to validate local changes on pull requests), but local builds let you iterate without pushing tags or waiting for remote runners. |
| 44 | + |
| 45 | +When to build locally |
| 46 | +- Fast iteration when changing bindings, packaging logic, or test code. |
| 47 | +- Debugging native-build problems where you need immediate access to compiler and linker output. |
| 48 | +- Packaging-only checks: point `pyalp/setup.py` at an existing `.so` (via `PREBUILT_PYALP_SO`) to validate wheel contents without rebuilding native code. |
| 49 | + |
| 50 | +How to build wheels locally (quick recipe) |
| 51 | +- Prepare a per-ABI build directory, run CMake from it, and build a backend target (example for Python 3.11; `$ALP_REPO_PATH` points to the ALP repository checkout):
| 52 | +
| 53 | +```bash
| 54 | +cmake -DENABLE_PYALP=ON -DCMAKE_BUILD_TYPE=Release "$ALP_REPO_PATH"
| 55 | +make pyalp_ref
| 56 | +# Append the new path to PYTHONPATH, i.e. export PYTHONPATH="$PYTHONPATH:$(pwd)/python"
| 57 | +```
| 58 | +- Build a wheel from the `pyalp` package and point it at the per-ABI build dir (via `CMAKE_BUILD_DIR`) so the generated metadata and prebuilt `.so` get picked up; the commands are listed under "Build and test locally" in "Adding a new compiled backend".
| 59 | + |
| 60 | +Advantages of local builds
| 61 | +- Performance: optimisations for the build architecture can be enabled.
| 62 | +- Speed: no remote queue or tag/push cycle. |
| 63 | +- Control: change CMake flags and environment variables and rebuild immediately. |
| 64 | +- Debuggability: full compiler/linker logs and the ability to attach tools. |
| 65 | + |
| 66 | + |
| 67 | +Full publish pipeline (publish-to-testpypi.yml + promote-to-pypi.yml) |
| 68 | +----------------------------------------------------------------- |
| 69 | +The full repository publish flow is implemented in two primary workflows: |
| 70 | + |
| 71 | +- `publish-to-testpypi.yml` — builds wheels for multiple ABIs/OSes using `cibuildwheel`, publishes them to TestPyPI, uploads wheel artifacts to a GitHub Release, and runs verification steps that install the TestPyPI package into a clean virtualenv for smoke tests. This workflow is triggered by pushing a tag matching `pyalp.v*`. |
| 72 | + |
| 73 | +- `promote-to-pypi.yml` — a gated workflow that downloads wheel assets from a GitHub Release and uploads them to PyPI. This job requires the `production` environment and uses the `PYPI_API_TOKEN` secret; the environment gating ensures human approval before the token is available to the workflow. |
| 74 | + |
| 75 | +Key differences vs local builds |
| 76 | +- Scope: the publish pipelines run multiple ABIs and platforms, produce canonical release artifacts, and publish them to TestPyPI/PyPI. |
| 77 | +- Reproducibility: CI uses standard manylinux containers and controlled macOS runners to produce wheels intended for distribution; this reduces host-specific variation. |
| 78 | +- Approval and secrets: promote-to-pypi requires an environment approval to access the PyPI token, preventing accidental publishes. |
| 79 | + |
| 80 | +When to use the publish pipeline |
| 81 | +- After local validation and CI runs (e.g., `pyalp-ci.yml` for PRs), create an annotated tag `pyalp.vX.Y.Z` and push it to trigger `publish-to-testpypi.yml`. |
| 82 | +- Once TestPyPI artifacts are validated, run `promote-to-pypi.yml` (workflow dispatch) to publish to PyPI; this step requires environment approval and the presence of the `PYPI_API_TOKEN` secret. |
| 83 | + |
| 84 | +Operational note: TestPyPI propagation and verification |
| 85 | +- The verification step that installs wheels from TestPyPI can occasionally fail due to propagation delays between upload and index availability. If the TestPyPI install step fails transiently, re-run the workflow or re-trigger the release; the promote job should only be run once test artifacts are available and verified. |
| 86 | + |
| 87 | + |
| 88 | + |
| 89 | + |
| 90 | +High-level contract |
| 91 | +- Inputs: CMake-based native backends built by the top-level CMake tree, a generated Python metadata file produced by CMake, and the Python package source in `pyalp/src`. |
| 92 | +- Output: Platform-specific wheels that contain the compiled shared object(s) and a generated `_metadata.py` file. The published PyPI project name is `alp-graphblas`, but the import name inside Python remains `pyalp`. |
| 93 | +- Success criteria: `pip install alp-graphblas` (from TestPyPI or PyPI) yields a package exposing `pyalp.get_build_metadata()` and one or more backend modules accessible via `pyalp.get_backend(<name>)`.
| 94 | + |
| 95 | +Where things live (important files) |
| 96 | +- `pyalp/pyproject.toml` — project metadata used by CI and for the package release (project name, version, runtime dependencies such as numpy). |
| 97 | +- `pyalp/setup.py` — custom setuptools glue. It either copies prebuilt shared objects from the CMake build tree into the wheel (preferred for CI-built wheels) or builds from source with pybind11 when no prebuilt artifact is present. |
| 98 | +- `pyalp/src/pyalp/_metadata.py.in` — CMake template used to generate `pyalp_metadata.py` (copied into wheels as `_metadata.py`). If you change the runtime metadata shape, update this file and the code that reads it. |
| 99 | +- Top-level CMake files (`CMakeLists.txt` and `src/…`) — define native targets such as `pyalp_ref`, `pyalp_omp`, `pyalp_nonblocking`. CI runs a top-level CMake configure/build per-Python-ABI and produces the native `.so` files and the generated metadata file. |
| 100 | +- `.github/workflows/publish-to-testpypi.yml` — builds wheels with cibuildwheel and publishes to TestPyPI (trigger: push tag `pyalp.v*`). This workflow also creates a GitHub Release with wheel assets. |
| 101 | +- `.github/workflows/promote-to-pypi.yml` — promotes a GitHub Release's wheel assets to PyPI. The job requires `environment: production` (see repository settings) and uses the secret `PYPI_API_TOKEN`. |
| 102 | +- `.github/scripts/` — helper scripts used by CI (e.g., verification and TestPyPI wait scripts). |
| 103 | + |
| 104 | +How the CI build produces a wheel (brief) |
| 105 | +- cibuildwheel is used to produce wheels for multiple Python ABIs and OSes. |
| 106 | +- Before building each wheel, CI runs a `CIBW_BEFORE_BUILD` script which: |
| 107 | + - Installs CMake + Ninja inside the build container. |
| 108 | + - Derives Git and package version metadata and sets environment variables. |
| 109 | + - Configures a per-ABI CMake build directory (e.g. `build/cp311`) and runs CMake to produce the compiled backends and a generated `pyalp_metadata.py` file inside that build dir. |
| 110 | + - Exports `CMAKE_BUILD_DIR` pointing to the per-ABI build directory so `pyalp/setup.py` can locate the generated outputs. |
| 111 | +- The packaging step runs `pyalp/setup.py` (setup.py will copy discovered prebuilt `.so` files and the generated metadata file into the package build directory). The wheel built by cibuildwheel therefore contains the prebuilt, ABI-specific `.so` and `_metadata.py`. |
| 112 | + |
| 113 | +How `pyalp/setup.py` cooperates with CMake |
| 114 | +- By default `setup.py` searches the repo `../build/**` tree for prebuilt shared objects named like the native targets (`pyalp_ref`, `pyalp_omp`, `pyalp_nonblocking`). If it finds them it adds Extension entries with empty sources and uses a custom `build_ext` to copy the prebuilt library into the wheel. |
| 115 | +- `setup.py` looks for the generated metadata file in the directory pointed to by the `CMAKE_BUILD_DIR` environment variable (set by the CI before_build script). If present it copies `pyalp_metadata.py` -> `_metadata.py` next to the extension in the wheel. |
| 116 | +- If no prebuilt modules are detected and `pybind11` is available, `setup.py` will fall back to building from sources with pybind11. |
| 117 | +- Environment variables you can use locally: |
| 118 | + - `CMAKE_BUILD_DIR` — path to the per-ABI CMake build dir that contains `pyalp_metadata.py` and the built `.so` files. |
| 119 | + - `PREBUILT_PYALP_SO` or `PYALP_PREBUILT_SO` — point to a single prebuilt shared object to include in the wheel (helpful for local testing). |
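
The discovery step described above can be approximated as follows. This is a sketch under stated assumptions, not the code in `pyalp/setup.py`: `find_prebuilt` is a hypothetical name, and the module name is taken to be the part of the filename before the first dot.

```python
from pathlib import Path


def find_prebuilt(build_root, names=("pyalp_ref", "pyalp_omp", "pyalp_nonblocking")):
    """Map module name -> path of the first matching shared object.

    Hypothetical illustration of glob-based discovery under a build tree.
    """
    found = {}
    for path in sorted(Path(build_root).rglob("*")):
        if path.suffix not in (".so", ".pyd") or not path.is_file():
            continue
        # "pyalp_ref.cpython-311-x86_64-linux-gnu.so" -> "pyalp_ref"
        stem = path.name.split(".", 1)[0]
        if stem in names and stem not in found:
            found[stem] = path
    return found
```

Calling `find_prebuilt` on the directory named by `CMAKE_BUILD_DIR` before building a wheel shows which backends the packaging step could pick up from that tree.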
| 120 | + |
| 121 | +Adding a new compiled backend (step-by-step) |
| 122 | +1) Add a CMake target |
| 123 | + - Add a target to your CMake configuration (top-level CMake or `pyalp` subdirectory). Name it with the prefix used by `setup.py` (for example `pyalp_mybackend` if you want the backend import name to be `pyalp_mybackend`). |
| 124 | + - Ensure the target produces a shared library file named so that it will be discoverable by the existing glob in `pyalp/setup.py` (the packaging code looks for `build/**/<target>*.(so|pyd)`). |
| 125 | + - If the backend needs additional compile flags or third-party deps, add those to the CMake target and to the cibuildwheel before-build step where platform-specific dependencies are installed. |
| 126 | + |
| 127 | +2) Expose the pybind11 module name correctly |
| 128 | + - The module name that Python imports must match the filename stem: for a target `pyalp_mybackend` the shared object should be named something like `pyalp_mybackend.cpython-311-x86_64-linux-gnu.so`; it is installed into the `pyalp` package directory and is importable as `pyalp.pyalp_mybackend` or accessible through the helper APIs.
| 129 | + - `setup.py` maps module names to the extension name `pyalp.<module_name>`; if you introduce a module with a different naming scheme, update `pyalp/setup.py`'s discovery or add an explicit mapping. |
| 130 | + |
| 131 | +3) Update CI build targets |
| 132 | + - The cibuildwheel `CIBW_BEFORE_BUILD` script exports a `BUILD_TARGETS` variable used by CMake to restrict which targets to build. Edit `.github/workflows/publish-to-testpypi.yml` under `CIBW_BEFORE_BUILD` to include your new target name in `BUILD_TARGETS`. |
| 133 | + - If your backend requires platform-specific dependency installation (e.g., libnuma, libomp) ensure those package installs are available in the before-build block. |
| 134 | + |
| 135 | +4) Update packaging helpers if needed |
| 136 | + - If your module uses a new stem that the setup script won't detect, add the module name to the `supported` list in `pyalp/setup.py` or rely on the glob search. |
| 137 | + - If you want to bundle multiple backends under a different naming convention, update `find_all_prebuilt()` discovery logic and the code that constructs `Extension(f"pyalp.{modname}")` entries. |
| 138 | + |
| 139 | +5) Add/adjust tests |
| 140 | + - Add small smoke tests (ideally under `tests/python/` or `tests/smoke/`) that run the new backend. Prefer running each backend in its own process where feasible to avoid pybind11 registration collisions. |
| 141 | + |
| 142 | +6) Build and test locally (quick recipe) |
| 143 | + - Ensure system dependencies are installed: cmake, ninja, a C++ toolchain, and any library dependencies.
| 144 | + - Create a per-ABI build dir and configure CMake as CI does. Example (for Python 3.11): |
| 145 | + ```bash |
| 146 | + mkdir -p build/cp311 |
| 147 | + cmake -S . -B build/cp311 -G Ninja -DCMAKE_BUILD_TYPE=Release -DENABLE_PYALP=ON -DCMAKE_POSITION_INDEPENDENT_CODE=ON -DPython3_EXECUTABLE=$(which python3) |
| 148 | + cmake --build build/cp311 --target pyalp_ref pyalp_mybackend --parallel |
| 149 | + ``` |
| 150 | + |
| 151 | + - Build a wheel locally from the `pyalp` package. From the repository root: |
| 152 | + ```bash |
| 153 | + export CMAKE_BUILD_DIR="$(pwd)/build/cp311" |
| 154 | + cd pyalp |
| 155 | + # Build a wheel using the package directory's setup.py |
| 156 | + python -m pip wheel . --no-deps -w ../wheelhouse |
| 157 | +
| 158 | + # Install and test the wheel in a fresh venv |
| 159 | + python -m venv /tmp/venv_test |
| 160 | + source /tmp/venv_test/bin/activate |
| 161 | + python -m pip install --upgrade pip |
| 162 | + python -m pip install ../wheelhouse/alp_graphblas-*.whl
| 163 | + ``` |
| 164 | + |
| 165 | + - Note: `--no-deps` is optional when building locally; published wheels should contain runtime dependency metadata so that pip will pull `numpy` automatically. |
| 166 | + |
| 167 | +Releases and publishing (how CI is wired) |
| 168 | +- Creating a TestPyPI release (normal path): |
| 169 | + 1. Bump the version in `pyalp/pyproject.toml` (recommended) and commit. |
| 170 | + 2. Create a git tag of the form `pyalp.vX.Y.Z` and push the tag. The `publish-to-testpypi.yml` workflow is triggered on push tags matching `pyalp.v*`. |
| 171 | + 3. The workflow builds wheels (cibuildwheel), uploads wheel artifacts as GitHub workflow artifacts, publishes to TestPyPI, and creates a GitHub Release with the wheel assets. |
| 172 | + |
| 173 | +- Promoting to PyPI (two-step gated publish):
| 174 | + - The `publish-to-testpypi.yml` workflow automatically builds and deploys wheels to TestPyPI and then attempts to install and verify those wheels in a fresh virtual environment. Occasionally this verification can fail due to propagation delays between upload and availability; if that happens, re-run the workflow (or re-trigger the release) until the verification completes successfully.
| 175 | + - The `promote-to-pypi.yml` workflow is triggered manually (`workflow_dispatch`) and is enabled only for `pyalp.v*` tags. It downloads the assets attached to the GitHub Release and uploads them to PyPI using the secret `PYPI_API_TOKEN`.
| 176 | + - The promote job is configured to use the repository `production` environment. Access to the `PYPI_API_TOKEN` secret in that environment requires an approval step by repository administrators (see Settings → Environments → production). |
| 177 | + |
| 178 | +Checklist before releasing |
| 179 | +- Bump `pyalp/pyproject.toml` version. |
| 180 | +- Ensure `pyalp/pyproject.toml` includes runtime dependencies (e.g., `numpy>=1.22`) so pip installs them automatically. |
| 181 | +- Ensure `CIBW_BEFORE_BUILD` in `.github/workflows/publish-to-testpypi.yml` builds your new backend (`BUILD_TARGETS` updated). |
| 182 | +- If your backend needs extra system packages (libnuma, libomp, etc.), add those install steps to the before-build script or document the manual requirements. |
| 183 | +- Add smoke tests that import and exercise the backend. Run them against installed wheels (CI verifies installed wheels in a separate job). |
| 184 | +- Create the tag `pyalp.vX.Y.Z` and push it; observe the `alp-graphblas wheels (cibuildwheel)` workflow. |
| 185 | + |
| 186 | +Troubleshooting / common pitfalls |
| 187 | +- Missing metadata in wheels: Make sure CMake writes the generated `pyalp_metadata.py` into the per-ABI build dir (CI sets `CMAKE_BUILD_DIR` and `setup.py` copies `pyalp_metadata.py` -> `_metadata.py`). If your metadata template changed, update `pyalp/src/pyalp/_metadata.py.in`. |
| 188 | +- Prebuilt `.so` not found: `pyalp/setup.py` discovers prebuilt shared objects under `build/**`. Ensure you used the same target name and that the produced filename contains the Python ABI tag (or set `PREBUILT_PYALP_SO` to the path). |
| 189 | +- ABI contamination across wheels: CI uses per-ABI build directories (e.g. `build/cp311`) to avoid cross-ABI contamination. When testing locally, clean build dirs between ABI runs. |
| 190 | +- pybind11 registration collisions: If you see type-registration errors when importing multiple different backends in the same process, prefer running backends in separate processes or ensure pybind11 wrappers use `py::module_local()` for types that may be defined in multiple modules. |
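
The first two pitfalls can be checked directly on a built wheel, since a wheel is a zip archive. The sketch below is illustrative only: `inspect_wheel` is a hypothetical helper, and it assumes the metadata file and the backend modules sit under `pyalp/` inside the archive, as the packaging description above suggests.

```python
import zipfile


def inspect_wheel(wheel_path):
    """Return (has_metadata, backend_files) for a built wheel.

    Hypothetical helper; assumes files are stored under "pyalp/".
    """
    with zipfile.ZipFile(wheel_path) as wheel:
        names = wheel.namelist()
    has_metadata = "pyalp/_metadata.py" in names
    backends = [
        n for n in names
        if n.startswith("pyalp/pyalp_") and n.endswith((".so", ".pyd"))
    ]
    return has_metadata, backends
```

An empty `backend_files` list points at the discovery problem, while `has_metadata` being `False` points at the `CMAKE_BUILD_DIR` setting.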
| 191 | +
| 192 | +Security notes |
| 193 | +- The promotion workflow uses a `PYPI_API_TOKEN` stored as a secret (likely in the repository environment `production`). If you did not create this token yourself, check: |
| 194 | + - Repository Settings → Secrets and variables → Actions |
| 195 | + - Environments → production → Secrets |
| 196 | + - Organization-level secrets (if applicable) |
| 197 | +- Rotate/revoke tokens if you discover an unexpected token. |
| 198 | +
| 199 | +Appendix — quick pointers to edit points |
| 200 | +- Add CMake target: top-level CMake / `pyalp/src` CMakeLists. |
| 201 | +- Ensure discovery in `pyalp/setup.py`: supported names in `find_all_prebuilt()` and the glob-based discovery. |
| 202 | +- Include generated metadata: `pyalp/src/pyalp/_metadata.py.in` (CMake variables are substituted into this template). |
| 203 | +- CI build targets: `.github/workflows/publish-to-testpypi.yml` (search for `BUILD_TARGETS` and `BACKEND_FLAGS` in `CIBW_BEFORE_BUILD`). |
| 204 | +- Promote workflow: `.github/workflows/promote-to-pypi.yml` (uses `PYPI_API_TOKEN` and `environment: production`). |
| 205 | +
| 206 | +