Commit e280841

Improve the C API memory management documentation
1 parent b42bbea commit e280841

1 file changed

+30
-23
lines changed

docs/contributor/IMPLEMENTATION_DETAILS.md

Lines changed: 30 additions & 23 deletions
@@ -96,9 +96,9 @@ files to get an idea of how rules are set for patches to be applied.
9696

9797
We always run with a GIL, because C extensions in CPython expect to do so and
9898
are usually not written to be reentrant. The reason to always have the GIL
99-
enabled is that when using Python, at least Sulong/LLVM is always available in
100-
the same context and we cannot know if someone may be using that (or another
101-
polyglot language or the Java host interop) to start additional threads that
99+
enabled is that when using Python, another polyglot language or the Java host
100+
interop can be available in the same context, and we cannot know if someone
101+
may be using that to start additional threads that
102102
could call back into Python. This could legitimately happen in C extensions when
103103
the C extension authors use knowledge of how CPython works to do something
104104
GIL-less in a C thread that is fine to do on CPython's data structures, but not
@@ -171,7 +171,8 @@ safepoint action mechanism can thus be used to kill threads waiting on the GIL.
171171
### High-level
172172

173173
C extensions assume reference counting, but on the managed side we want to leverage
174-
Java tracing GC. This creates a mismatch. The approach is to combine the two.
174+
Java tracing GC. This creates a mismatch. The approach is to do both, reference
175+
counting and tracing GC, at the same time.
175176

176177
On the native side we use reference counting. The native code is responsible for doing
177178
the counting, i.e., calling the `Py_IncRef` and `Py_DecRef` API functions. Inside those
@@ -193,12 +194,12 @@ There are two kinds of Python objects in GraalPy: managed and native.
193194

194195
Managed objects are allocated in the interpreter. If there is no native code involved,
195196
we do not do anything special and let the Java GC handle them. When a managed object
196-
leaks to native extension:
197+
is passed to native extension code:
197198

198199
* We wrap it in `PythonObjectNativeWrapper`. This is mostly in order to provide different
199200
interop protocol: we do not want to expose `toNative` and `asPointer` on Python objects.
200201

201-
* When NFI or Sulong call `toNative`/`asPointer` we:
202+
* When NFI calls `toNative`/`asPointer` we:
202203
* Allocate C memory that will represent the object on the native side (including the refcount field)
203204
* Add a mapping of that memory address to the `PythonObjectNativeWrapper` object to a hash map `CApiTransitions.nativeLookup`.
204205
* We initialize the refcount field to a constant `MANAGED_REFCNT` (larger number, because some
@@ -216,38 +217,44 @@ as long as there are some native references. We set a field `PythonObjectReferen
216217
which will keep the `PythonObjectNativeWrapper` alive even when all other managed references die.
217218

218219
* When extension code is done with the object, it will call `Py_DecRef`.
219-
In the C implementation of `Py_DecRef` we check if a managed object with refcount==MANAGED_REFCNT+1
220+
In the C implementation of `Py_DecRef` we check if a managed object with `refcount == MANAGED_REFCNT+1`
220221
wants to decrement its refcount to MANAGED_REFCNT, which means that there are no native references
221222
to that object anymore. In such case we clear the `PythonObjectReference.strongReference` field,
222223
and the memory management is then again left solely to the Java tracing GC.
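
The `Py_IncRef`/`Py_DecRef` handling of managed objects described above can be illustrated with a toy Python model. This is only a sketch, not GraalPy code: the class `ToyReference`, the function names, and the concrete value of `MANAGED_REFCNT` are assumptions made for illustration, and the weak link to the wrapper is modelled as a plain attribute. It also assumes the strong reference is set as soon as the refcount rises above `MANAGED_REFCNT`.

```python
MANAGED_REFCNT = 1 << 20  # hypothetical value; the text only says it is a large constant


class ToyReference:
    """Toy stand-in for the record that tracks one wrapper exposed to native code."""

    def __init__(self, wrapper):
        self.wrapper = wrapper          # stands in for a weak link to the wrapper
        self.strong_reference = None    # set only while native references exist
        self.refcount = MANAGED_REFCNT  # initial value: no native references yet


def py_inc_ref(ref):
    """Native code takes a reference: pin the wrapper against the tracing GC."""
    ref.refcount += 1
    if ref.refcount > MANAGED_REFCNT:
        ref.strong_reference = ref.wrapper


def py_dec_ref(ref):
    """Native code drops a reference: unpin when the last native reference goes."""
    if ref.refcount == MANAGED_REFCNT + 1:
        # Going back to MANAGED_REFCNT means no native references remain,
        # so lifetime is again decided solely by the tracing GC.
        ref.strong_reference = None
    ref.refcount -= 1
```

In this model, two `py_inc_ref` calls followed by two `py_dec_ref` calls leave the refcount at `MANAGED_REFCNT` with the strong reference cleared; the reference stays set after the first decrement only.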
223224

224225
#### Native Objects
225226

226-
Native objects are backed by native memory and may never leak to managed code. If they do not
227-
leak to managed code, they are reference counted as usual, where `Py_DecRef` call that reaches
228-
`0` will deallocate the object. If a native object does leak to managed code:
227+
Native objects allocated using `PyObject_GC_New` in the native code are backed by native memory
228+
and may never be passed to managed code (as a return value of extension function or as an argument
229+
to some C API call). If a native object is not made available to managed code, it is just reference
230+
counted as usual, where `Py_DecRef` call that reaches `0` will deallocate the object. If a native
231+
object is passed to managed code:
229232

230233
* We increment the refcount of the native object by `MANAGED_REFCNT`
231234
* We create:
232-
* `PythonAbstractNativeObject` Java object to represent it
235+
* `PythonAbstractNativeObject` Java object to mirror it on the managed side
233236
* `NativeObjectReference`, a weak reference to the `PythonAbstractNativeObject`.
234-
* Save the mapping from the native object address to the `NativeObjectReference`
235-
object into hash map `CApiTransitions.nativeLookup` (next time this native object leaks to
236-
the managed code, we only fetch the existing wrapper and don't do any of this).
237+
* Add mapping: native object address => `NativeObjectReference` into hash map `CApiTransitions.nativeLookup`
238+
* Next time we just fetch the existing wrapper and don't do any of this
237239
* When `NativeObjectReference` is enqueued, we decrement the refcount by `MANAGED_REFCNT`
238-
and if it falls to `0`, it means that there are no references to the object even from
239-
native code, we can destroy it. If it does not fall to `0`, we just wait for the native
240-
code to eventually call `Py_DecRef` that makes it fall to `0`.
240+
* If the refcount falls to `0`, it means that there are no references to the object even from
241+
native code, and we can destroy it. If it does not fall to `0`, we just wait for the native
242+
code to eventually call `Py_DecRef` that makes it fall to `0`.
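
The lifecycle of a native object that is passed to managed code can be sketched in the same toy style. Again, this is an illustrative model under stated assumptions rather than the actual implementation: `NativeObject`, `pass_to_managed`, `on_reference_enqueued`, and the value of `MANAGED_REFCNT` are hypothetical names and values, the weak reference is modelled as a plain dictionary entry, and removing the `native_lookup` entry on enqueue is an assumption that the text does not state.

```python
MANAGED_REFCNT = 1 << 20  # hypothetical value; the text only says it is a large constant

native_lookup = {}  # toy counterpart of the address -> reference hash map


class NativeObject:
    """Toy native object: an address, a refcount, and a 'freed' flag."""

    def __init__(self, address):
        self.address = address
        self.refcount = 1   # one native reference held by its creator
        self.freed = False


def py_dec_ref(obj):
    """Ordinary reference counting: reaching 0 deallocates the object."""
    obj.refcount -= 1
    if obj.refcount == 0:
        obj.freed = True


def pass_to_managed(obj):
    """First transition creates the mirror; later ones reuse the mapping."""
    ref = native_lookup.get(obj.address)
    if ref is None:
        obj.refcount += MANAGED_REFCNT
        ref = {"native": obj}  # stands in for the weak reference and its mirror
        native_lookup[obj.address] = ref
    return ref


def on_reference_enqueued(obj):
    """The managed mirror was collected: give back the managed share."""
    native_lookup.pop(obj.address, None)  # assumption: the mapping is dropped here
    obj.refcount -= MANAGED_REFCNT
    if obj.refcount == 0:
        obj.freed = True  # no native references remain either
```

If native code still holds references when the mirror is collected, the refcount stays above `0` and the object is freed later by an ordinary `py_dec_ref`.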
243+
244+
#### Weak References
245+
246+
TODO
241247

242248
### Cycle GC
243249

244250
We leverage the CPython's GC module to detect cycles for objects that participate
245-
in the reference counting scheme (native objects or managed objects that leaked to native).
251+
in the reference counting scheme (native objects or managed objects that got passed
252+
to native code).
246253
See: https://devguide.python.org/internals/garbage-collector/index.html.
247254

248255
There are two issues:
249256

250-
* Objects that are referenced from the managed code have refcount >= `MANAGED_REFCNT` and
257+
* Objects that are referenced from the managed code have `refcount >= MANAGED_REFCNT` and
251258
until Java GC runs we do not know if they are garbage or not.
252259
* We cannot traverse the managed objects: since we don't do refcounting on the managed
253260
side, we cannot traverse them and decrement refcounts to see if there is a cycle.
@@ -272,13 +279,13 @@ count them into this limit. Let us call this limit *weak to strong limit*.
272279
After this, if the managed objects are garbage, eventually Java GC will collect them
273280
together with the whole cycle.
274281

275-
If some of the managed objects are not garbage, and they leak back to native code,
282+
If some of the managed objects are not garbage, and they are passed back to native code,
276283
the native code can then access and resurrect the whole cycle. W.r.t. the refcounts
277284
integrity this is fine, because we did not alter the refcounts. The native references
278285
between the objects are still factored in their refcounts. What may seem like a problem
279-
is that we pushed the *weak to strong limit* for some objects. Such object may leak to
280-
native, get `Py_IncRef`'ed making it strong reference again. Since `Py_DecRef` is
286+
is that we pushed the *weak to strong limit* for some objects. Such an object may be
287+
passed to native, get `Py_IncRef`'ed making it strong reference again. Since `Py_DecRef` is
281288
checking the same `MANAGED_REFCNT` limit for all objects, the subsequent `Py_DecRef`
282289
call for this object will not detect that the reference should be made weak again!
283290
However, this is OK, it only prolongs the collection: we will make it weak again in
284-
the next run of the cycle GC.
291+
the next run of the cycle GC on the native side.
