Commit 502fc92

docs(gc): document boa_gc API surface used by the engine (#38)
* tests: add coverage for WeakGc constructor (weak.rs)
  Signed-off-by: mrhapile <allinonegaming3456@gmail.com>
* boa_gc API surface
  Signed-off-by: mrhapile <allinonegaming3456@gmail.com>
* remove unrelated WeakGc tests from API surface documentation PR
  Signed-off-by: mrhapile <allinonegaming3456@gmail.com>
* formating
  Signed-off-by: mrhapile <allinonegaming3456@gmail.com>
* docs: add methodology and usage links to boa_gc API surface doc
  Signed-off-by: mrhapile <allinonegaming3456@gmail.com>

---------

Signed-off-by: mrhapile <allinonegaming3456@gmail.com>
1 parent f791b1a commit 502fc92

1 file changed

+378
-0
lines changed

docs/boa_gc_api_surface.md

Lines changed: 378 additions & 0 deletions
@@ -0,0 +1,378 @@
# Boa GC API Surface

## 1. Overview

The Boa JavaScript engine depends on the `boa_gc` crate for all garbage-collected memory
management. This document defines the subset of the `boa_gc` public API that the engine
actually uses at compile time and runtime.

Any alternative garbage collector aiming to replace `boa_gc` (for example, Oscars GC)
**must implement this interface** to serve as a drop-in replacement. APIs not listed here
are internal to the collector and are not part of the engine-facing contract.

## Methodology

The API surface documented here was derived by inspecting the current
`boa_gc` usage inside the Boa repository.

The process included:

- Searching for `use boa_gc::` imports across the engine
- Inspecting how core types (`Gc`, `WeakGc`, `GcRefCell`, `Trace`, `Finalize`)
  are used throughout the codebase
- Reviewing `Trace` and `Finalize` implementations to understand the
  traversal contract
- Inspecting weak structures (`WeakMap`, `WeakRef`) to identify required
  weak reference semantics

This was done through manual inspection of the Boa engine source to
extract the GC interface boundary relied upon by the runtime.

---

## 2. Core Pointer Types

| Type | Role |
|---|---|
| `Gc<T>` | Strong, trace-aware smart pointer. Primary way the engine holds GC-managed values. |
| `WeakGc<T>` | Weak reference that does not prevent collection. Used for caches and JS `WeakRef`. |
| `GcRefCell<T>` | Interior-mutability wrapper for values stored behind a `Gc`. Analogous to `RefCell`. |
| `GcRef<'a, T>` | Immutable borrow guard returned by `GcRefCell::borrow`. |
| `GcRefMut<'a, T>` | Mutable borrow guard returned by `GcRefCell::borrow_mut`. |

These five types appear in virtually every subsystem of the engine: the object model,
environments, bytecode compiler, module system, and builtins.

Example usage in Boa:

- https://github.com/boa-dev/boa/blob/main/core/engine/src/object/jsobject.rs
- https://github.com/boa-dev/boa/blob/main/core/engine/src/value/mod.rs

---

## 3. Pointer Operations

### Allocation

```rust
// Allocate a new GC-managed value.
Gc::new(value: T) -> Gc<T>

// Allocate a value that may reference itself through a weak pointer.
Gc::new_cyclic<F>(data_fn: F) -> Gc<T>
    where F: FnOnce(&WeakGc<T>) -> T
```
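The self-referential pattern behind `Gc::new_cyclic` has a close analogue in the standard library. The sketch below uses `std::rc::Rc::new_cyclic` and `std::rc::Weak` purely as stand-ins: they are reference-counted rather than traced and are not part of `boa_gc`. The `Node` type and the `make_node` helper are hypothetical names chosen for illustration.

```rust
use std::rc::{Rc, Weak};

/// Hypothetical node that keeps a weak handle to its own allocation.
struct Node {
    name: &'static str,
    me: Weak<Node>,
}

/// Build a node whose `me` field points back at the node itself.
fn make_node(name: &'static str) -> Rc<Node> {
    // The closure receives a weak pointer to the allocation that is still
    // being initialised; upgrading it inside the closure would yield `None`.
    Rc::new_cyclic(|weak| Node { name, me: weak.clone() })
}
```

Once construction has finished, `make_node("root").me.upgrade()` yields a strong pointer to the same allocation, which is the property a replacement collector would need to preserve for its own cyclic constructor.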

### Cloning & Identity

```rust
// Duplicate the smart pointer (increments root tracking).
impl Clone for Gc<T>

// Compare two pointers by address, not by value.
Gc::ptr_eq(this: &Gc<T>, other: &Gc<U>) -> bool
```
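The difference between identity and value equality can be seen with the standard library's `Rc`, used here only as an analogy for `Gc`. Unlike the `Gc::ptr_eq` signature listed above, `Rc::ptr_eq` requires both pointers to have the same type. The `compare` helper is hypothetical.

```rust
use std::rc::Rc;

/// Returns `(same_allocation, equal_by_value)` for two pointers.
fn compare(a: &Rc<String>, b: &Rc<String>) -> (bool, bool) {
    // `ptr_eq` compares addresses; `==` compares the pointed-to strings.
    (Rc::ptr_eq(a, b), a == b)
}
```

A clone of a pointer compares equal on both counts, while a second allocation holding an equal string is equal by value only.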

### Raw Pointer Conversion

Used at FFI boundaries (native function closures, synthetic modules).

```rust
// Consume the Gc and return a raw pointer. Must be paired with from_raw.
Gc::into_raw(this: Gc<T>) -> NonNull<GcBox<T>>

// Reconstruct a Gc from a raw pointer previously obtained via into_raw.
unsafe fn Gc::from_raw(ptr: NonNull<GcBox<T>>) -> Gc<T>
```
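The pairing requirement can be illustrated with `Rc::into_raw` and `Rc::from_raw` from the standard library, which follow the same ownership discipline. This is an analogy, not the `boa_gc` implementation; `export` and `reclaim` are hypothetical helpers and the opaque `c_void` pointer stands in for whatever the foreign side stores.

```rust
use std::ffi::c_void;
use std::rc::Rc;

/// Hand ownership of the value to "foreign" code as an opaque pointer.
fn export(value: Rc<String>) -> *mut c_void {
    // No reference count changes: the raw pointer now owns one strong count.
    Rc::into_raw(value) as *mut c_void
}

/// Take ownership back from the opaque pointer.
///
/// # Safety
/// `ptr` must come from `export` and must be reclaimed exactly once.
unsafe fn reclaim(ptr: *mut c_void) -> Rc<String> {
    unsafe { Rc::from_raw(ptr as *const String) }
}
```

Forgetting to call `reclaim` leaks the value, and calling it twice frees it twice, which is why the reconstructing side is `unsafe`.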

### Type Casting

Used by the object model to downcast erased object types.

```rust
// Runtime type check and downcast.
Gc::downcast<U>(this: Gc<T>) -> Option<Gc<U>>

// Unchecked downcast. Caller must guarantee correctness.
unsafe fn Gc::cast_unchecked<U>(this: Gc<T>) -> Gc<U>

// Unchecked reference cast without consuming the pointer.
unsafe fn Gc::cast_ref_unchecked<U>(this: &Gc<T>) -> &Gc<U>
```
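A checked downcast of a type-erased pointer can be sketched with `Rc<dyn Any>` from the standard library. This only mirrors the shape of `Gc::downcast`; how `boa_gc` performs its runtime check is not covered here, and `downcast_to` is a hypothetical helper.

```rust
use std::any::Any;
use std::rc::Rc;

/// Try to recover a concrete `Rc<T>` from a type-erased pointer.
fn downcast_to<T: Any>(erased: Rc<dyn Any>) -> Option<Rc<T>> {
    // `Rc::<dyn Any>::downcast` checks the `TypeId` at runtime and returns
    // the original pointer in `Err` when the types do not match.
    erased.downcast::<T>().ok()
}
```

The unchecked variants skip this runtime check entirely, so the caller carries the proof obligation that the erased value really has the requested type.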

### Dereferencing

```rust
impl Deref for Gc<T> { type Target = T; }
```

`Gc<T>` transparently dereferences to `T`, allowing direct field and method access.

---

## 4. Interior Mutability API

### GcRefCell

```rust
GcRefCell::new(value: T) -> GcRefCell<T>

GcRefCell::borrow(&self) -> GcRef<'_, T>
GcRefCell::borrow_mut(&self) -> GcRefMut<'_, T>
GcRefCell::try_borrow(&self) -> Result<GcRef<'_, T>, BorrowError>
GcRefCell::try_borrow_mut(&self) -> Result<GcRefMut<'_, T>, BorrowMutError>
GcRefCell::into_inner(self) -> T
```
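Since `GcRefCell` is described as analogous to `RefCell`, the dynamic borrow rules can be sketched with `std::cell::RefCell`. The assumption here is that `GcRefCell` enforces the same "many readers or one writer" rule; `borrow_rules` is a hypothetical helper.

```rust
use std::cell::RefCell;

/// Returns whether a mutable borrow succeeds while a shared borrow is alive,
/// and whether it succeeds once that borrow has been released.
fn borrow_rules(cell: &RefCell<Vec<u32>>) -> (bool, bool) {
    let during = {
        let _reader = cell.borrow();
        // A shared guard is alive, so the mutable borrow must be refused.
        let granted = cell.try_borrow_mut().is_ok();
        granted
    };
    // The shared guard has been dropped; the mutable borrow is granted.
    let after = match cell.try_borrow_mut() {
        Ok(mut guard) => {
            guard.push(1);
            true
        }
        Err(_) => false,
    };
    (during, after)
}
```

The `try_*` variants report the conflict as an error value, whereas `borrow` and `borrow_mut` on a `RefCell` panic in the same situation.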

### Borrow Guard Mapping

`GcRef` and `GcRefMut` support projecting the borrow into a sub-field of the
contained value, similar to `std::cell::Ref::map`.

```rust
GcRef::map<U>(orig: GcRef<'_, T>, f: F) -> GcRef<'_, U>
GcRef::try_map<U>(orig: GcRef<'_, T>, f: F) -> Result<GcRef<'_, U>, GcRef<'_, T>>
GcRef::cast<U>(orig: GcRef<'_, T>) -> GcRef<'_, U>   // unsafe or checked downcast

GcRefMut::map<U>(orig: GcRefMut<'_, T>, f: F) -> GcRefMut<'_, U>
GcRefMut::try_map<U>(orig: GcRefMut<'_, T>, f: F) -> Result<GcRefMut<'_, U>, GcRefMut<'_, T>>
GcRefMut::cast<U>(orig: GcRefMut<'_, T>) -> GcRefMut<'_, U>
```
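Because the text compares guard mapping to `std::cell::Ref::map`, the projection can be shown with the standard library directly. The `Object` type and the `slots` and `name` accessors are hypothetical; the sketch assumes `GcRef::map` behaves like its `std` counterpart.

```rust
use std::cell::{Ref, RefCell};

/// Hypothetical object with fields we want to borrow individually.
struct Object {
    name: String,
    slots: Vec<u32>,
}

/// Narrow a borrow of the whole object down to a borrow of `slots`.
fn slots(cell: &RefCell<Object>) -> Ref<'_, Vec<u32>> {
    Ref::map(cell.borrow(), |obj| &obj.slots)
}

/// Narrow a borrow of the whole object down to a borrow of its name.
fn name(cell: &RefCell<Object>) -> Ref<'_, str> {
    Ref::map(cell.borrow(), |obj| obj.name.as_str())
}
```

The mapped guard still counts as a borrow of the whole cell, so a mutable borrow is refused for as long as the projected guard is alive.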

---

## 5. Weak Reference System

### WeakGc

```rust
// Create a weak reference from a strong pointer.
WeakGc::new(value: &Gc<T>) -> WeakGc<T>

// Attempt to obtain a strong reference. Returns None if the object was collected.
WeakGc::upgrade(&self) -> Option<Gc<T>>
```

A `WeakGc<T>` **must not** prevent its referent from being collected. After collection,
`upgrade()` must return `None`.
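The observable contract of `upgrade` can be sketched with `std::rc::Weak`, where dropping the last strong pointer stands in for the collector reclaiming the object. This is only an analogy: in `boa_gc` the referent disappears during a collection cycle, not at a deterministic drop point. `observe` is a hypothetical helper.

```rust
use std::rc::{Rc, Weak};

/// Observe a weak handle before and after its last strong owner goes away.
fn observe() -> (Option<u32>, Option<u32>) {
    let strong = Rc::new(7_u32);
    let weak: Weak<u32> = Rc::downgrade(&strong);
    // While a strong pointer exists, upgrading succeeds.
    let before = weak.upgrade().map(|rc| *rc);
    // Stand-in for "the collector reclaimed the object".
    drop(strong);
    // The weak handle did not keep the value alive.
    let after = weak.upgrade().map(|rc| *rc);
    (before, after)
}
```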

### Ephemeron

An `Ephemeron<K, V>` is a key-value pair where the value is only considered reachable
if the key is independently reachable. `WeakGc<T>` is implemented internally as
`Ephemeron<T, ()>`.

### WeakMap

`WeakMap<K, V>` is a GC-aware associative container whose entries are automatically
removed when their key becomes unreachable. It is used by the engine to implement the
ECMAScript `WeakMap` and `WeakSet` builtins.

---
172+
173+
## 6. GC Traits
174+
175+
### Trace
176+
177+
```rust
178+
pub unsafe trait Trace {
179+
/// Mark all GC pointers contained within this value.
180+
unsafe fn trace(&self, tracer: &mut Tracer);
181+
182+
/// Count non-root references during the root-detection phase.
183+
unsafe fn trace_non_roots(&self);
184+
185+
/// Execute the associated finalizer.
186+
fn run_finalizer(&self);
187+
}
188+
```
189+
190+
Every type stored inside a `Gc<T>` must implement `Trace`. The collector calls `trace`
191+
during the mark phase to discover reachable objects.
192+
193+
Example implementations:
194+
195+
- https://github.com/boa-dev/boa/blob/main/core/engine/src/object/jsobject.rs
196+
- https://github.com/boa-dev/boa/blob/main/core/engine/src/context/mod.rs
197+
198+
### Finalize
199+
200+
```rust
201+
pub trait Finalize {
202+
/// Called before the object is reclaimed.
203+
fn finalize(&self);
204+
}
205+
```
206+
207+
`Finalize` provides a cleanup hook that runs **before** memory is freed. Implementations
208+
may perform resource cleanup (e.g., releasing file handles or detaching array buffers).
209+
210+
> **Important:** Finalizers execute before the sweep phase and may resurrect objects by
211+
> storing references back into the live graph. The collector must re-mark the heap after
212+
> finalization to handle this case.
213+
214+
---

## 7. Derive Macros

### Automatic Derivation

```rust
#[derive(Trace, Finalize)]
pub struct MyStruct {
    field: Gc<OtherStruct>,
}
```

`#[derive(Trace)]` generates a `trace` implementation that recursively traces each field.
`#[derive(Finalize)]` generates an empty finalizer.

### Manual Tracing Helpers

```rust
// Implement Trace with a custom body.
custom_trace!(this, mark, {
    mark(&this.field_a);
    mark(&this.field_b);
});

// Implement Trace as a no-op for types containing no GC pointers.
empty_trace!();

// Unsafe variant of empty_trace for foreign types.
unsafe_empty_trace!();
```

---

## 8. Tracing Infrastructure

The `Tracer` type is passed to every `Trace::trace` implementation during the mark phase.

```rust
impl Tracer {
    /// Enqueue a GC pointer for marking.
    pub fn enqueue(&mut self, ptr: GcErasedPointer);
}
```

The `custom_trace!` macro provides a `mark` closure that calls `tracer.enqueue` internally,
so most code interacts with the tracer indirectly:

```rust
custom_trace!(this, mark, {
    mark(&this.some_gc_field);
});
```

The tracer uses an internal work queue to avoid deep recursion when walking large object
graphs.
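The idea of enqueueing children instead of recursing into them can be sketched on a toy heap. Everything below is hypothetical and simplified: objects are plain indices, `Heap` is an adjacency list, and this `Tracer` only shares its name and its `enqueue` method with the real type. The pending list is processed last-in, first-out, which does not matter for marking.

```rust
/// Toy heap: `edges[i]` lists the objects that object `i` points to.
struct Heap {
    edges: Vec<Vec<usize>>,
}

/// Toy tracer holding the pending work list.
struct Tracer {
    queue: Vec<usize>,
}

impl Tracer {
    /// Record an object to be visited later instead of recursing into it.
    fn enqueue(&mut self, id: usize) {
        self.queue.push(id);
    }
}

/// Mark everything reachable from `roots` and return the mark bits.
fn mark(heap: &Heap, roots: &[usize]) -> Vec<bool> {
    let mut marked = vec![false; heap.edges.len()];
    let mut tracer = Tracer { queue: roots.to_vec() };
    while let Some(id) = tracer.queue.pop() {
        if marked[id] {
            continue; // already visited; this also terminates on cycles
        }
        marked[id] = true;
        // "Tracing" an object only enqueues its children.
        for &child in &heap.edges[id] {
            tracer.enqueue(child);
        }
    }
    marked
}
```

Because the loop keeps its state on the heap rather than on the call stack, a linked chain of a hundred thousand objects is marked without any risk of stack overflow.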

---

## 9. Weak Collections

### WeakMap

```rust
WeakMap::new() -> WeakMap<K, V>
WeakMap::get(&self, key: &K) -> Option<V>
WeakMap::set(&self, key: &K, value: V)
WeakMap::has(&self, key: &K) -> bool
WeakMap::delete(&self, key: &K) -> bool
WeakMap::get_or_insert(&self, key: &K, value: V) -> V
WeakMap::get_or_insert_computed(&self, key: &K, f: F) -> V
```

### Ephemeron Rule

> A value in an ephemeron pair remains reachable **only if** the key is independently
> reachable through the strong reference graph.

The collector must implement this rule during the mark phase:
1. Mark all strong roots.
2. For each ephemeron, if the key is marked, trace the value.
3. Repeat until no new marks are produced.
4. Remaining ephemerons with unmarked keys are considered dead.
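The four steps above can be sketched as a fixed-point loop over a toy heap. This is a minimal illustration under stated assumptions, not the `boa_gc` algorithm: objects are indices, `Heap`, `mark_from`, and `collect` are hypothetical names, and ephemerons are plain `(key, value)` index pairs.

```rust
/// Toy heap: `edges[i]` lists strong references held by object `i`.
struct Heap {
    edges: Vec<Vec<usize>>,
    /// `(key, value)` pairs; a value is traced only once its key is marked.
    ephemerons: Vec<(usize, usize)>,
}

/// Mark everything strongly reachable from `start`.
fn mark_from(heap: &Heap, start: usize, marked: &mut [bool]) {
    let mut stack = vec![start];
    while let Some(id) = stack.pop() {
        if !marked[id] {
            marked[id] = true;
            stack.extend(heap.edges[id].iter().copied());
        }
    }
}

/// Returns the mark bits and the indices of dead ephemerons.
fn collect(heap: &Heap, roots: &[usize]) -> (Vec<bool>, Vec<usize>) {
    let mut marked = vec![false; heap.edges.len()];
    // Step 1: mark all strong roots.
    for &root in roots {
        mark_from(heap, root, &mut marked);
    }
    // Steps 2 and 3: trace values of marked keys until nothing changes.
    loop {
        let mut progress = false;
        for &(key, value) in &heap.ephemerons {
            if marked[key] && !marked[value] {
                mark_from(heap, value, &mut marked);
                progress = true;
            }
        }
        if !progress {
            break;
        }
    }
    // Step 4: ephemerons whose key stayed unmarked are dead.
    let dead: Vec<usize> = heap
        .ephemerons
        .iter()
        .enumerate()
        .filter(|(_, pair)| !marked[pair.0])
        .map(|(index, _)| index)
        .collect();
    (marked, dead)
}
```

The repetition in step 3 matters when one ephemeron's value is what keeps another ephemeron's key alive: the second pair only becomes live after the first pair's value has been traced.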

---

## 10. Runtime Utilities

```rust
/// Trigger an immediate garbage collection cycle.
pub fn force_collect();

/// Returns true if it is safe to run finalizers (i.e., the collector is not
/// currently in the sweep/drop phase).
pub fn finalizer_safe() -> bool;
```

`force_collect` is used by tests, the CLI debugger, and indirectly by `WeakRef` deref
semantics. `finalizer_safe` guards against use-after-free during the drop phase.

---

## 11. Allocation Model

All GC allocations go through `Gc::new`:

```rust
let obj = Gc::new(MyValue { ... });
let cell = Gc::new(GcRefCell::new(inner));
```

**No heap handle is passed.** The GC runtime manages heap state internally (typically
via thread-local storage). This means a replacement collector must either:
- use a thread-local or global allocator, **or**
- refactor the engine to pass an explicit context (breaking change).
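The first option, allocation without an explicit heap argument, can be sketched with `thread_local!`. The `HEAP` static, the `Handle` type, and the `alloc` and `get` functions are all hypothetical and far simpler than a real collector; the point is only that callers never pass a context.

```rust
use std::cell::RefCell;

thread_local! {
    /// Hypothetical per-thread heap; a real collector stores far more state.
    static HEAP: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

/// Hypothetical handle: an index into the current thread's heap.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Handle(usize);

/// Allocate without the caller passing any heap or context argument.
fn alloc(value: &str) -> Handle {
    HEAP.with(|heap| {
        let mut heap = heap.borrow_mut();
        heap.push(value.to_string());
        Handle(heap.len() - 1)
    })
}

/// Read a value back from the current thread's heap.
fn get(handle: Handle) -> Option<String> {
    HEAP.with(|heap| heap.borrow().get(handle.0).cloned())
}
```

One consequence of this design is that each thread sees its own heap, so a handle created on one thread does not resolve on another.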

---

## 12. Minimal Compatibility Contract

### Pointer Types
- `Gc<T>`, `WeakGc<T>`, `GcRefCell<T>`, `GcRef<T>`, `GcRefMut<T>`

### Traits
- `Trace`, `Finalize`

### Derive & Helper Macros
- `#[derive(Trace)]`, `#[derive(Finalize)]`
- `custom_trace!`, `empty_trace!`, `unsafe_empty_trace!`

### Pointer Methods
- `Gc::new`, `Gc::new_cyclic`, `Gc::into_raw`, `Gc::from_raw`
- `Gc::ptr_eq`, `Gc::downcast`, `Gc::cast_unchecked`, `Gc::cast_ref_unchecked`
- `Clone`, `Deref`

### Interior Mutability
- `GcRefCell::new`, `borrow`, `borrow_mut`, `try_borrow`, `try_borrow_mut`, `into_inner`
- `GcRef::map`, `GcRef::try_map`, `GcRef::cast`
- `GcRefMut::map`, `GcRefMut::try_map`, `GcRefMut::cast`

### Weak References
- `WeakGc::new`, `WeakGc::upgrade`
- `WeakMap` with full CRUD API
- Ephemeron semantics

### Runtime Utilities
- `force_collect()`, `finalizer_safe()`

---

## 13. Conclusion

Boa relies on a small but precise garbage collector interface organized around five
concepts:

1. **Strong pointers** (`Gc<T>`) for all heap-allocated engine objects.
2. **Weak references** (`WeakGc<T>`, `Ephemeron`, `WeakMap`) for caches and JS weak
   collections.
3. **Interior mutability** (`GcRefCell<T>`) for safe mutation behind shared pointers.
4. **Trait-based tracing** (`Trace`, `Finalize`) for reachability analysis and cleanup.
5. **Macro-generated traversal** (`#[derive(Trace)]`, `custom_trace!`) for ergonomic
   integration with 100+ engine types.

Any collector that implements this interface, with stable non-moving pointer identity
and correct ephemeron support, can serve as a drop-in replacement for `boa_gc`.
