Skip to content

Conversation

@JDevlieghere
Copy link
Member

@JDevlieghere JDevlieghere commented Nov 14, 2025

This is a continuation of my effort [1] to eliminate uses of the private Python API in LLDB. I found 3 more issues that were hidden in the typemap, because those were included before we defined Py_LIMITED_API.

The primary motivator for bumping the minimum required Python version is our use of pybuffer (i.e. the Buffer Protocol).

  • For Python versions prior to Python 3.11, your only option was to target the "Old Buffer Protocol" [2], which has been deprecated since Python 3 and removed in Python 3.13, though the symbols still exist for ABI compatibility.

  • For Python versions 3.13 and later, because of the removal, your only option is to target the "New Buffer Protocol" [3] which was only added to the stable API in Python 3.11.

This leads to certain combination of Python versions for building against, and targeting with the limited API, that are not compatible. For example, you cannot target the 3.8 stable API when building against Python 3.13 because the new version of Python doesn't declare the old APIs and you cannot use the new APIs yet.

To make this more complicated, the use of the buffer protocol is coming from SWIG. For unrelated reasons, to work around a bug in SWIG < 4.1, we already reimplement the swig type wrapper for the Buffer Protocol in a way that's compatible with Python 3.11 and later. It's only when I tried to remove the workaround in #167808 (because it was incompatible with the Python 3.8 Limited C API) that I discovered that SWIG gets this wrong (though it has been fixed on the master branch [4][5]). Instead of waiting for a SWIG release with the fix (and bumping the minimum SWIG version), I propose we keep the workaround. Regardless of the approach, it requires bumping the minimum Python version, when targeting the limited C API, to 3.11.

[1] #151617
[2] https://docs.python.org/3.12/c-api/objbuffer.html
[3] https://docs.python.org/3/c-api/buffer.html
[4] swig/swig@05c439d
[5] swig/swig@8710367

This is a continuation of my effort [1] to eliminate uses of the private
Python API in LLDB. I found 3 more issues that were hidden in the
typemap, because those were included before we defined Py_LIMITED_API.

The primary motivator for bumping the minimum required Python version is
our use of `pybuffer` (i.e. the Buffer Protocol).

 - For Python versions prior to Python 3.11, your only option was to
   target the "Old Buffer Protocol" [2], which has been deprecated since
   Python 3 and removed in Python 3.13, though the symbols still exist
   for ABI compatibility.

 - For Python versions 3.13 and later, because of the removal, your only
   option is to target the "New Buffer Protocol" [3] which was only
   added to the stable API in Python 3.11.

This leads to certain combination of Python versions for building
against and targeting with the limited API that are not compatible. For
example, you cannot target the 3.8 stable API when building against
Python 3.13 because the new version of Python doesn't declare the old
APIs and you cannot use the new APIs yet.

To make this more complicated, the use of the buffer protocol is coming
from SWIG. Furthermore, all released versions of SWIG get this wrong.
For unrelated reasons, to work around a bug in SWIG < 4.1, we already
reimplement the swig type wrapper for the Buffer Protocol in a way
that's compatible with Python 3.11 and later. Instead of waiting for a
SWIG release that fixes this issue (it's already fixed on the master
branch) we can keep the workaround and bump the minimum Python version
when targeting the limited C API.

[1] llvm#151617
[2] https://docs.python.org/3.12/c-api/objbuffer.html
[3] https://docs.python.org/3/c-api/buffer.html
@llvmbot
Copy link
Member

llvmbot commented Nov 14, 2025

@llvm/pr-subscribers-lldb

Author: Jonas Devlieghere (JDevlieghere)

Changes

This is a continuation of my effort [1] to eliminate uses of the private Python API in LLDB. I found 3 more issues that were hidden in the typemap, because those were included before we defined Py_LIMITED_API.

The primary motivator for bumping the minimum required Python version is our use of pybuffer (i.e. the Buffer Protocol).

  • For Python versions prior to Python 3.11, your only option was to target the "Old Buffer Protocol" [2], which has been deprecated since Python 3 and removed in Python 3.13, though the symbols still exist for ABI compatibility.

  • For Python versions 3.13 and later, because of the removal, your only option is to target the "New Buffer Protocol" [3] which was only added to the stable API in Python 3.11.

This leads to certain combination of Python versions for building against and targeting with the limited API that are not compatible. For example, you cannot target the 3.8 stable API when building against Python 3.13 because the new version of Python doesn't declare the old APIs and you cannot use the new APIs yet.

To make this more complicated, the use of the buffer protocol is coming from SWIG. For unrelated reasons, to work around a bug in SWIG < 4.1 (*), we already reimplement the swig type wrapper for the Buffer Protocol in a way that's compatible with Python 3.11 and later. It's only when I tried to remove the workaround in #167808 (because it was incompatible with the Python 3.8 Limited C API) that I discovered that SWIG gets this wrong (though it has been fixed on the master branch [4][5]). Instead of waiting for a SWIG release with the fix (and bumping the minimum SWIG version), I propose we keep the workaround and instead bump the minimum Python version when targeting the limited C API to 3.11.

[1] #151617
[2] https://docs.python.org/3.12/c-api/objbuffer.html
[3] https://docs.python.org/3/c-api/buffer.html
[4] swig/swig@05c439d
[5] swig/swig@8710367


Full diff: https://github.com/llvm/llvm-project/pull/167990.diff

5 Files Affected:

  • (modified) lldb/bindings/python/python-typemaps.swig (+9-10)
  • (modified) lldb/bindings/python/python-wrapper.swig (+5-2)
  • (modified) lldb/bindings/python/python.swig (+5)
  • (modified) lldb/cmake/modules/LLDBConfig.cmake (+4-2)
  • (modified) lldb/source/Plugins/ScriptInterpreter/Python/lldb-python.h (+4-3)
diff --git a/lldb/bindings/python/python-typemaps.swig b/lldb/bindings/python/python-typemaps.swig
index 3baeaa7770e6c..f73ab768b4c42 100644
--- a/lldb/bindings/python/python-typemaps.swig
+++ b/lldb/bindings/python/python-typemaps.swig
@@ -220,9 +220,9 @@ AND call SWIG_fail at the same time, because it will result in a double free.
 }
 
 // Disable default type checking for this method to avoid SWIG dispatch issues.
-// 
+//
 // Problem: SBThread::GetStopDescription has two overloads:
-//   1. GetStopDescription(char* dst_or_null, size_t dst_len) 
+//   1. GetStopDescription(char* dst_or_null, size_t dst_len)
 //   2. GetStopDescription(lldb::SBStream& stream)
 //
 // SWIG generates a dispatch function to select the correct overload based on argument types.
@@ -230,9 +230,9 @@ AND call SWIG_fail at the same time, because it will result in a double free.
 // However, this dispatcher doesn't consider typemaps that transform function signatures.
 //
 // In Python, our typemap converts GetStopDescription(char*, size_t) to GetStopDescription(int).
-// The dispatcher still checks against the original (char*, size_t) signature instead of 
+// The dispatcher still checks against the original (char*, size_t) signature instead of
 // the transformed (int) signature, causing type matching to fail.
-// This only affects SBThread::GetStopDescription since the type check also matches 
+// This only affects SBThread::GetStopDescription since the type check also matches
 // the argument name, which is unique to this function.
 %typemap(typecheck, precedence=SWIG_TYPECHECK_POINTER) (char *dst_or_null, size_t dst_len) ""
 
@@ -628,12 +628,11 @@ template <> bool SetNumberFromPyObject<double>(double &number, PyObject *obj) {
   }
 }
 
-// These two pybuffer macros are copied out of swig/Lib/python/pybuffer.i,
-// and fixed so they will not crash if PyObject_GetBuffer fails.
-// https://github.com/swig/swig/issues/1640
-//
-// I've also moved the call to PyBuffer_Release to the end of the SWIG wrapper,
-// doing it right away is not legal according to the python buffer protocol.
+// These two pybuffer macros are copied out of swig/Lib/python/pybuffer.i.
+// - Avoids crash if PyObject_GetBuffer fails and delays calling
+//   PyBuffer_Release (https://github.com/swig/swig/issues/1640)
+// - Avoids using deprecated "old buffer protocol" APIs when targeting the
+//   Stable C API.
 %inline %{
 struct Py_buffer_RAII {
   Py_buffer buffer = {};
diff --git a/lldb/bindings/python/python-wrapper.swig b/lldb/bindings/python/python-wrapper.swig
index 3a0995e84f643..9b9245eac14ce 100644
--- a/lldb/bindings/python/python-wrapper.swig
+++ b/lldb/bindings/python/python-wrapper.swig
@@ -128,8 +128,11 @@ bool lldb_private::python::SWIGBridge::LLDBSwigPythonCallTypeScript(
 
   PyObject *pfunc_impl = nullptr;
 
-  if (pyfunct_wrapper && *pyfunct_wrapper &&
-      PyFunction_Check(*pyfunct_wrapper)) {
+  if (pyfunct_wrapper && *pyfunct_wrapper 
+#ifndef Py_LIMITED_API
+      && PyFunction_Check(*pyfunct_wrapper)
+#endif
+      ) {
     pfunc_impl = (PyObject *)(*pyfunct_wrapper);
     if (pfunc_impl->ob_refcnt == 1) {
       Py_XDECREF(pfunc_impl);
diff --git a/lldb/bindings/python/python.swig b/lldb/bindings/python/python.swig
index 4a5a39dc4b06d..b2823f98acac8 100644
--- a/lldb/bindings/python/python.swig
+++ b/lldb/bindings/python/python.swig
@@ -59,6 +59,11 @@ except ImportError:
 // Parameter types will be used in the autodoc string.
 %feature("autodoc", "1");
 
+// Include lldb-python first as it sets Py_LIMITED_API.
+%begin %{
+#include "../source/Plugins/ScriptInterpreter/Python/lldb-python.h"
+%}
+
 %pythoncode%{
 import uuid
 import re
diff --git a/lldb/cmake/modules/LLDBConfig.cmake b/lldb/cmake/modules/LLDBConfig.cmake
index 4b568d27c4709..0abaa9fb763e7 100644
--- a/lldb/cmake/modules/LLDBConfig.cmake
+++ b/lldb/cmake/modules/LLDBConfig.cmake
@@ -180,12 +180,14 @@ if (LLDB_ENABLE_PYTHON)
       "Path to use as PYTHONHOME in lldb. If a relative path is specified, it will be resolved at runtime relative to liblldb directory.")
   endif()
 
-  if (SWIG_VERSION VERSION_GREATER_EQUAL "4.2" AND NOT LLDB_EMBED_PYTHON_HOME)
+  if (SWIG_VERSION VERSION_GREATER_EQUAL "4.2"
+      Python3_VERSION VERSION_GREATER_EQUAL "3.11"
+      AND NOT LLDB_EMBED_PYTHON_HOME)
     set(default_enable_python_limited_api ON)
   else()
     set(default_enable_python_limited_api OFF)
   endif()
-  option(LLDB_ENABLE_PYTHON_LIMITED_API "Force LLDB to only use the Python Limited API (requires SWIG 4.2 or later)"
+  option(LLDB_ENABLE_PYTHON_LIMITED_API "Only use the Python Limited API (requires at least Python 3.11 and SWIG 4.2)"
     ${default_enable_python_limited_api})
 else()
   # Even if Python scripting is disabled, we still need a Python interpreter to
diff --git a/lldb/source/Plugins/ScriptInterpreter/Python/lldb-python.h b/lldb/source/Plugins/ScriptInterpreter/Python/lldb-python.h
index 0c281793613a5..98941f504e44c 100644
--- a/lldb/source/Plugins/ScriptInterpreter/Python/lldb-python.h
+++ b/lldb/source/Plugins/ScriptInterpreter/Python/lldb-python.h
@@ -45,12 +45,13 @@ static llvm::Expected<bool> *g_fcxx_modules_workaround [[maybe_unused]];
 #include <locale>
 #endif
 
-#define LLDB_MINIMUM_PYTHON_VERSION 0x03080000
-
-#if LLDB_ENABLE_PYTHON_LIMITED_API
 // If defined, LLDB will be ABI-compatible with all Python 3 releases from the
 // specified one onward, and can use Limited API introduced up to that version.
+#if LLDB_ENABLE_PYTHON_LIMITED_API
+#define LLDB_MINIMUM_PYTHON_VERSION 0x030b0000
 #define Py_LIMITED_API LLDB_MINIMUM_PYTHON_VERSION
+#else
+#define LLDB_MINIMUM_PYTHON_VERSION 0x03080000
 #endif
 
 // Include python for non windows machines

set(default_enable_python_limited_api OFF)
endif()
option(LLDB_ENABLE_PYTHON_LIMITED_API "Force LLDB to only use the Python Limited API (requires SWIG 4.2 or later)"
option(LLDB_ENABLE_PYTHON_LIMITED_API "Only use the Python Limited API (requires at least Python 3.11 and SWIG 4.2)"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to print a diagnostic if someone uses -DLLDB_ENABLE_PYTHON_LIMITED_API=On with old python/SWIG?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using an older Python will get caught by the static_assert in lldb-python.h. An older version of SWIG wouldn't, but I also don't necessarily want to prevent it when someone explicitly provided the override.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. (My thinking is that there are two ways someone would want to use the option: one, to force it off because the new API usage causes some issue, or two, force it on to enforce a particular build config. In the latter case, you really want to catch the mistake as soon as possible, instead of giving some weird build error later.)

Copy link
Member

@bulbazord bulbazord Nov 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a good middle ground may be to emit a warning? Something like "Hey you did this unsupported thing and now you're on your own"

Copy link
Member Author

@JDevlieghere JDevlieghere Nov 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the PR to diagnose the issues at configuration time. I used SEND_ERROR so that if there's multiple issues, you don't need to keep reconfiguring. I think the target audience that would want to work around this and try different versions is most likely just me and maybe a handful of other LLDB devs who can easily turn the error into a warning while working on it.

Copy link
Member

@bulbazord bulbazord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall. I'm a bit surprised at the lengths we need to go to support a stable python API but I guess I shouldn't be. 😄

@medismailben
Copy link
Member

Instead of using PyBuffer, since this is mostly used for char*, size_t maybe you can use PyBytes_AsStringAndSize prior to python 3.11

Copy link
Member

@medismailben medismailben left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be we can avoid bumping the python version by using PyBytes_AsStringAndSize instead of PyBuffer

@JDevlieghere
Copy link
Member Author

That's a good point. I eliminated the use of PyBuffer in the PythonDataObjects by using PyBytes instead. Maybe we can do the same thing for the typemap. Let me give that a shot.

@JDevlieghere
Copy link
Member Author

That's a good point. I eliminated the use of PyBuffer in the PythonDataObjects by using PyBytes instead. Maybe we can do the same thing for the typemap. Let me give that a shot.

As it turns out, the answer is no. The goal of this typemap is to support anything implementing the buffer protocol. In order to do that, we need a Python API to get the data out of something that implements the protocol, so there's really no other way to achieve this.

@medismailben
Copy link
Member

That's a good point. I eliminated the use of PyBuffer in the PythonDataObjects by using PyBytes instead. Maybe we can do the same thing for the typemap. Let me give that a shot.

As it turns out, the answer is no. The goal of this typemap is to support anything implementing the buffer protocol. In order to do that, we need a Python API to get the data out of something that implements the protocol, so there's really no other way to achieve this.

AFAICT, the only APIs that uses those bindings are SBFile::{Read,Write}. Moving them to a PyBytes or PyByteArray would be fine those are native types that are compatible with the buffer protocol.

Certain objects available in Python wrap access to an underlying memory array or buffer. Such objects include the built-in bytes and bytearray, and some extension types like array.array.

@JDevlieghere
Copy link
Member Author

JDevlieghere commented Nov 14, 2025

Correct:

  • This is only used in SBFile::Read and SBFile::Write.
  • PyBytes, PyByteArray and memoryview are the built-in types that confirm to the Buffer Protocol.

So you're suggesting that we change SBFile::Read and SBFile::Write to only support PyBytes and PyByteArray and drop support for other types that support the buffer protocol? I think that's also a reasonable path forward.

JDevlieghere added a commit to JDevlieghere/llvm-project that referenced this pull request Nov 14, 2025
This is an alternative solution to the issue described in llvm#167990, which
can be summarized as that we cannot target Python 3.8 with the stable
API and support building for Python 3.13 and later due to the buffer
protocol.

The approach taken in this PR, and proposed by Ismail, is to sidesteps
the issue by dropping support for the buffer protocol. The only two
users are SBFile::Read and SBFile::Write. Instead, we support PyBytes
and PyByteArray which are the builtin types that conform to the buffer
protocol. Technically, this means a small regression, where those
methods could previously take custom types that conform to Python's
buffer protocol. Like Ismail, I think this is acceptable given the
alternatives.
JDevlieghere added a commit to JDevlieghere/llvm-project that referenced this pull request Nov 14, 2025
This is an alternative solution to the issue described in llvm#167990, which
can be summarized as that we cannot target Python 3.8 with the stable
API and support building for Python 3.13 and later due to the buffer
protocol.

The approach taken in this PR, and proposed by Ismail, is to sidesteps
the issue by dropping support for the buffer protocol. The only two
users are SBFile::Read and SBFile::Write. Instead, we support PyBytes
and PyByteArray which are the builtin types that conform to the buffer
protocol. Technically, this means a small regression, where those
methods could previously take custom types that conform to Python's
buffer protocol. Like Ismail, I think this is acceptable given the
alternatives.
@medismailben
Copy link
Member

medismailben commented Nov 14, 2025

So you're suggesting that we change SBFile::Read and SBFile::Write to only support PyBytes and PyByteArray and drop support for other types that support the buffer protocol? I think that's also a reasonable path forward.

I think it's more reasonable to do that instead of bumping the minimum version of python to use the limited C API. Users who exercise those APIs with something other than bytes or bytearray can also bridge to/from these types, since they're compatible with the buffer protocol.

JDevlieghere added a commit to JDevlieghere/llvm-project that referenced this pull request Nov 14, 2025
This is an alternative solution to the issue described in llvm#167990, which
can be summarized as that we cannot target Python 3.8 with the stable
API and support building for Python 3.13 and later due to the buffer
protocol.

The approach taken in this PR, and proposed by Ismail, is to sidesteps
the issue by dropping support for the buffer protocol. The only two
users are SBFile::Read and SBFile::Write. Instead, we support PyBytes
and PyByteArray which are the builtin types that conform to the buffer
protocol. Technically, this means a small regression, where those
methods could previously take custom types that conform to Python's
buffer protocol. Like Ismail, I think this is acceptable given the
alternatives.
@JDevlieghere
Copy link
Member Author

Assuming there are no objections, let's go with #168144.

JDevlieghere added a commit that referenced this pull request Nov 15, 2025
This is an alternative solution to the issue described in #167990, which
can be summarized as that we cannot target Python 3.8 with the stable
API and support building for Python 3.13 and later due to the buffer
protocol.

The approach taken in this PR, and proposed by Ismail, is to sidesteps
the issue by dropping support for the buffer protocol. The only two
users are SBFile::Read and SBFile::Write. Instead, we support PyBytes
and PyByteArray which are the builtin types that conform to the buffer
protocol. Technically, this means a small regression, where those
methods could previously take custom types that conform to Python's
buffer protocol. Like Ismail, I think this is acceptable given the
alternatives.

Co-authored-by: Med Ismail Bennani <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants