Skip to content

Comments

Opaque PyObject ABI support#5807

Draft
ngoldbaum wants to merge 25 commits intoPyO3:mainfrom
ngoldbaum:opaque-pyobject
Draft

Opaque PyObject ABI support#5807
ngoldbaum wants to merge 25 commits intoPyO3:mainfrom
ngoldbaum:opaque-pyobject

Conversation

@ngoldbaum
Copy link
Contributor

@ngoldbaum ngoldbaum commented Feb 13, 2026

Towards fixing #5786.

This is done on top of #5753 because that PR is needed to make this work.

With some help from @davidhewitt, this almost works but subclassing seems to be broken.

---- pycell::tests::test_pyref_as_super stdout ----
[src/pycell.rs:833:13] &pyref.inner = <builtins.SubSubClass object at 0x104ee6940>
[src/pycell.rs:834:13] &pyref.as_super().inner = <builtins.SubSubClass object at 0x104ee6940>

thread 'pycell::tests::test_pyref_as_super' (263860852) panicked at src/pycell.rs:835:13:
assertion `left == right` failed
  left: 20
 right: 10

I suspect using Layout to fill in LayoutAsBase is incorrect here:

type LayoutAsBase = <Self as #pyo3_path::impl_::pyclass::PyClassImpl>::Layout;

But I'm not sure how to update the macros and the macro wrappers to get that to work correctly.

@Icxolu do you happen to have any insight about what's going wrong?

TODO:

  • Hide PyModuldeDef in the FFI bindings.
  • Make tests fully pass

@ngoldbaum
Copy link
Contributor Author

Interesting: the careful tests fail in exactly the same way as the tests fail on the opaque pyobject build. Maybe a useful hint that there's some unsafety happening somewhere...

@ngoldbaum
Copy link
Contributor Author

the careful tests fail in exactly the same way as the tests fail on the opaque pyobject build.

It looks like all the tests on Python 3.12 and newer fail the abi3,full,multiple-pymethods tests. So it looks like the issue is using a PyVarObject in the limited API. Before this PR the base type is still a static PyObject.

As @davidhewitt pointed out when I chatted with him last Friday - it's worth checking to see if doing that introduces performance regressions. If it does, we need to make sure we set things up so a PyVarObject base type is only used on opaque pyobject builds.

codspeed uses the free-threaded build so it's not testing this configuration at all. I need to run benchmarks locally to see what the performance impact is. We also need a benchmark for creating PyClass instances - there is a benchmark for type initialization but not for instance initialization as far as I can see.

@ngoldbaum
Copy link
Contributor Author

ngoldbaum commented Feb 16, 2026

I opened #5816 with a benchmark for creating an instance of a PyClass. I also had to hack in abi3 support for the benchmark crate, see #5818.

Running the bench_pyclass_create benchmark from #5816, I see the following results on Python 3.12.12:

cargo bench --manifest-path=pyo3-benches/Cargo.toml --features=abi3 bench_pyclass_create

With opaque PyObject:
bench_pyclass_create    time:   [42.311 ns 42.374 ns 42.441 ns]

Without:
bench_pyclass_create    time:   [40.705 ns 40.790 ns 40.882 ns]

I ran the benchmarks on my personal development machine so there's some noise in the results. I'm showing the best of four runs of this benchmark for each commit. I see similar results on 3.15a6.

Very roughly, there's a 5% performance cost. That's on the bubble for being significant but probably is too steep to force extensions to pay that cost on the 3.12 stable ABI.

I think I'll go ahead and adjust this PR to only use PyVarObject as a base when it's absolutely needed when _Py_OPAQUE_PYOBJECT is defined.

@ngoldbaum
Copy link
Contributor Author

I have a new theory about why subclassing is broken.

The docs for PyType_Spec::basicsize say:

If negative, the absolute value specifies how much space instances of the class need in addition to the superclass.

But right now PyVariableClassObject<T> (I think?) doesn't include the size of superclasses in its BASIC_SIZE implementation:

pyo3/src/pycell/impl_.rs

Lines 497 to 502 in 46c16e4

const BASIC_SIZE: ffi::Py_ssize_t = {
let size = std::mem::size_of::<PyClassObjectContents<T>>();
assert!(size <= ffi::Py_ssize_t::MAX as usize);
// negative to indicate 'extra' space that cpython will allocate for us
-(size as ffi::Py_ssize_t)
};

I tried to get adjust the basicsize that is passed into CPython inside create_type_object.rs using PyType_GetTypeDataSize, but that doesn't work because we want to get the size of superclass instances, and superclasses aren't guaranteed to have negative basicsizes, which the docs for PyType_GetTypeDataSize say is UB (and in practice causes a seg fault, I've found). I also found a typo in the FFI bindings for this function, which I'll send in a separate fix for.

Maybe PyVariableClassObject<T> really needs to be PyVariableClassObject<T, V> where V is the superclass? Then it would be easy to get the needed information at compile time.

@Icxolu sorry to give you another ping but hopefully this more focused question is a little easier to answer.

@Icxolu
Copy link
Member

Icxolu commented Feb 16, 2026

Yes, the BASIC_SIZE does not include the size of the superclass. But from my understanding, this is exactly what we are supposed to do. This is the size that we need additionally, the size that the base class requires should be handled by the base class itself, no? So in my mind if we have a chain like this:

#[pyclass(extends = PyDict]
struct Sub {
   // sub fields
}

#[pyclass(extends = Sub]
struct SubSub{
   // subsub fields
}

I think the basic size of Sub should include space for "sub fields" and the basic size of SubSub should include "subsub fields" since "sub fields" is already allocated by "Sub" itself.

@ngoldbaum
Copy link
Contributor Author

But from my understanding, this is exactly what we are supposed to do. This is the size that we need additionally, the size that the base class requires should be handled by the base class itself, no?

I guess it depends on what "additionally" means exactly there. I don't see any tests in CPython for setting a negative basicsize on a subtype of another heap type but at least this test only includes the size of the struct and not any supertypes.

Maybe something else is broken?

For whatever reason on this branch on the Python 3.12 limited API, if you try to access data stored on a superclass, you instead access data stored in the subclass. I'd love some help understanding what's broken.

@ngoldbaum
Copy link
Contributor Author

ngoldbaum commented Feb 16, 2026

Maybe the issue is that PyRef and PyRefMut assume that it's safe to cast between a subclass and its parent class, but that's not allowed for relative heap types...

@ngoldbaum
Copy link
Contributor Author

More specifically, isn't this use of NonNull::cast invalid for relative heap types, because BaseType isn't guaranteed to have a compatible layout?

pyo3/src/pycell.rs

Lines 620 to 622 in 46c16e4

let mut ptr = NonNull::from(&mut self.inner)
// `Bound<T>` has the same layout as `Bound<T::BaseType>`
.cast::<Bound<'p, T::BaseType>>()

@davidhewitt
Copy link
Member

To me, it looks like this method on the variable class objects is probably broken in the face of subclassing:

pyo3/src/pycell/impl_.rs

Lines 481 to 486 in fe00ce7

fn get_contents_of_obj(obj: *mut ffi::PyObject) -> *mut PyClassObjectContents<T> {
// https://peps.python.org/pep-0697/
let type_obj = unsafe { ffi::Py_TYPE(obj) };
let pointer = unsafe { ffi::PyObject_GetTypeData(obj, type_obj) };
pointer.cast()
}

The call to Py_TYPE may return the wrong type, instead of T it may return a subtype of T, in which case the call to PyObject_GetTypeData will return a pointer to the wrong memory.

With the below patch I get a failing test on main:

(and if the fields are not both str we get worse memory corruption)

diff --git a/tests/test_inheritance.rs b/tests/test_inheritance.rs
index a7b0734602..4843c67fd9 100644
--- a/tests/test_inheritance.rs
+++ b/tests/test_inheritance.rs
@@ -304,7 +304,7 @@ mod inheriting_native_type {
     #[test]
     #[cfg(Py_3_12)]
     fn inherit_list() {
-        #[pyclass(extends=pyo3::types::PyList)]
+        #[pyclass(extends=pyo3::types::PyList, subclass)]
         struct ListWithName {
             #[pyo3(get)]
             name: &'static str,
@@ -318,12 +318,38 @@ mod inheriting_native_type {
             }
         }

+        #[pyclass(extends=ListWithName)]
+        struct SubListWithName {
+            #[pyo3(get)]
+            sub_name: &'static str,
+        }
+
+        #[pymethods]
+        impl SubListWithName {
+            #[new]
+            fn new() -> PyClassInitializer<Self> {
+                PyClassInitializer::from(ListWithName::new()).add_subclass(Self {
+                    sub_name: "Sublist",
+                })
+            }
+        }
+
         Python::attach(|py| {
-            let list_sub = pyo3::Bound::new(py, ListWithName::new()).unwrap();
+            let list_with_name = pyo3::Bound::new(py, ListWithName::new()).unwrap();
+            let sub_list_with_name = pyo3::Bound::new(py, SubListWithName::new()).unwrap();
             py_run!(
                 py,
-                list_sub,
-                r#"list_sub.append(1); assert list_sub[0] == 1; assert list_sub.name == "Hello :)""#
+                list_with_name sub_list_with_name,
+                r#"
+                    list_with_name.append(1)
+                    assert list_with_name[0] == 1
+                    assert list_with_name.name == "Hello :)", list_with_name.name
+
+                    sub_list_with_name.append(1)
+                    assert sub_list_with_name[0] == 1
+                    assert sub_list_with_name.name == "Hello :)", sub_list_with_name.name
+                    assert sub_list_with_name.sub_name == "Sublist", sub_list_with_name.sub_name
+                "#
             );
         });
     }

... will look into possible fixes for this urgently and seek to release 0.28.2 (probably yanking 0.28.0 and 0.28.1).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI-build-full CI-no-fail-fast If one job fails, allow the rest to keep testing free-threading proc-macro

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants