| 1 | +# Boa GC API Surface |
| 2 | + |
| 3 | +## 1. Overview |
| 4 | + |
| 5 | +The Boa JavaScript engine depends on the `boa_gc` crate for all garbage-collected memory |
| 6 | +management. This document defines the subset of the `boa_gc` public API that the engine |
| 7 | +actually uses at compile time and runtime. |
| 8 | + |
| 9 | +Any alternative garbage collector aiming to replace `boa_gc` (for example, Oscars GC) |
| 10 | +**must implement this interface** to serve as a drop-in replacement. APIs not listed here |
| 11 | +are internal to the collector and are not part of the engine-facing contract. |
| 12 | + |
| 13 | +## Methodology |
| 14 | + |
| 15 | +The API surface documented here was derived by inspecting the current |
| 16 | +`boa_gc` usage inside the Boa repository. |
| 17 | + |
| 18 | +The process included: |
| 19 | + |
| 20 | +- Searching for `use boa_gc::` imports across the engine |
| 21 | +- Inspecting how core types (`Gc`, `WeakGc`, `GcRefCell`, `Trace`, `Finalize`) |
| 22 | + are used throughout the codebase |
| 23 | +- Reviewing `Trace` and `Finalize` implementations to understand the |
| 24 | + traversal contract |
| 25 | +- Inspecting weak structures (`WeakMap`, `WeakRef`) to identify required |
| 26 | + weak reference semantics |
| 27 | + |
| 28 | +This was done through manual inspection of the Boa engine source to |
| 29 | +extract the GC interface boundary relied upon by the runtime. |
| 30 | + |
| 31 | +--- |
| 32 | + |
| 33 | +## 2. Core Pointer Types |
| 34 | + |
| 35 | +| Type | Role | |
| 36 | +|---|---| |
| 37 | +| `Gc<T>` | Strong, trace-aware smart pointer. Primary way the engine holds GC-managed values. | |
| 38 | +| `WeakGc<T>` | Weak reference that does not prevent collection. Used for caches and JS `WeakRef`. | |
| 39 | +| `GcRefCell<T>` | Interior-mutability wrapper for values stored behind a `Gc`. Analogous to `RefCell`. | |
| 40 | +| `GcRef<'a, T>` | Immutable borrow guard returned by `GcRefCell::borrow`. | |
| 41 | +| `GcRefMut<'a, T>` | Mutable borrow guard returned by `GcRefCell::borrow_mut`. | |
| 42 | + |
| 43 | +These five types appear in virtually every subsystem of the engine: the object model, |
| 44 | +environments, bytecode compiler, module system, and builtins. |
| 45 | + |
| 46 | +Example usage in Boa: |
| 47 | + |
| 48 | +- https://github.com/boa-dev/boa/blob/main/core/engine/src/object/jsobject.rs |
| 49 | +- https://github.com/boa-dev/boa/blob/main/core/engine/src/value/mod.rs |
| 50 | + |
| 51 | +--- |
| 52 | + |
| 53 | +## 3. Pointer Operations |
| 54 | + |
| 55 | +### Allocation |
| 56 | + |
| 57 | +```rust |
| 58 | +// Allocate a new GC-managed value. |
| 59 | +Gc::new(value: T) -> Gc<T> |
| 60 | + |
| 61 | +// Allocate a value that may reference itself through a weak pointer. |
| 62 | +Gc::new_cyclic<F>(data_fn: F) -> Gc<T> |
| 63 | +where F: FnOnce(&WeakGc<T>) -> T |
| 64 | +``` |
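The closure shape of `Gc::new_cyclic` matches the standard library's `Rc::new_cyclic`, so the self-referential pattern can be sketched with `std::rc` alone. This is an analogy rather than `boa_gc` code; the `Node` type and both helper functions are hypothetical names introduced for illustration.

```rust
use std::rc::{Rc, Weak};

// Hypothetical node that keeps a weak pointer back to its own allocation.
struct Node {
    id: u32,
    me: Weak<Node>,
}

// `Rc::new_cyclic` stands in for `Gc::new_cyclic`: the closure receives a weak
// pointer to the allocation that is still being constructed.
fn make_node(id: u32) -> Rc<Node> {
    Rc::new_cyclic(|weak| Node { id, me: weak.clone() })
}

// Returns true when upgrading the self-pointer yields the same allocation.
fn self_pointer_round_trips(id: u32) -> bool {
    let node = make_node(id);
    match node.me.upgrade() {
        Some(strong) => Rc::ptr_eq(&node, &strong) && strong.id == id,
        None => false,
    }
}
```

Inside the closure the weak pointer cannot be upgraded yet, because the value it refers to does not exist until the closure returns.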
| 65 | + |
| 66 | +### Cloning & Identity |
| 67 | + |
| 68 | +```rust |
| 69 | +// Duplicate the smart pointer (increments root tracking). |
| 70 | +impl Clone for Gc<T> |
| 71 | + |
| 72 | +// Compare two pointers by address, not by value. |
| 73 | +Gc::ptr_eq(this: &Gc<T>, other: &Gc<U>) -> bool |
| 74 | +``` |
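The difference between pointer identity and value equality can be shown with `std::rc::Rc` as a stand-in for `Gc`. This is only a sketch: `Rc::ptr_eq` requires both sides to have the same type, whereas the `Gc::ptr_eq` signature listed above accepts two different type parameters. The function name is hypothetical.

```rust
use std::rc::Rc;

// Returns (same allocation?, different allocation compared by address?, values equal?).
fn identity_vs_equality() -> (bool, bool, bool) {
    let a = Rc::new(String::from("x"));
    let b = Rc::clone(&a); // second handle to the same allocation
    let c = Rc::new(String::from("x")); // equal contents, separate allocation
    (Rc::ptr_eq(&a, &b), Rc::ptr_eq(&a, &c), *a == *c)
}
```

Here `a` and `c` hold equal strings but are distinct allocations, which is the distinction a pointer-based comparison captures.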
| 75 | + |
| 76 | +### Raw Pointer Conversion |
| 77 | + |
| 78 | +Used at FFI boundaries (native function closures, synthetic modules). |
| 79 | + |
| 80 | +```rust |
| 81 | +// Consume the Gc and return a raw pointer. Must be paired with from_raw. |
| 82 | +Gc::into_raw(this: Gc<T>) -> NonNull<GcBox<T>> |
| 83 | + |
| 84 | +// Reconstruct a Gc from a raw pointer previously obtained via into_raw. |
| 85 | +unsafe fn Gc::from_raw(ptr: NonNull<GcBox<T>>) -> Gc<T> |
| 86 | +``` |
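The pairing requirement between `into_raw` and `from_raw` follows the same discipline as `Rc::into_raw` and `Rc::from_raw` in the standard library. The sketch below uses `Rc` as a stand-in, so the raw pointer type differs from the `NonNull<GcBox<T>>` listed above; `round_trip` is a hypothetical helper.

```rust
use std::rc::Rc;

// Hand a value across an opaque boundary as a raw pointer and take it back.
fn round_trip(value: u64) -> u64 {
    let rc = Rc::new(value);
    // Ownership moves into the raw pointer; nothing is freed at this point.
    let raw: *const u64 = Rc::into_raw(rc);
    // SAFETY: `raw` came from `Rc::into_raw` above and is converted back exactly once.
    let back = unsafe { Rc::from_raw(raw) };
    *back
}
```

Skipping the `from_raw` call would leak the allocation, and calling it twice on the same pointer would be undefined behavior.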
| 87 | + |
| 88 | +### Type Casting |
| 89 | + |
| 90 | +Used by the object model to downcast erased object types. |
| 91 | + |
| 92 | +```rust |
| 93 | +// Runtime type check and downcast. |
| 94 | +Gc::downcast<U>(this: Gc<T>) -> Option<Gc<U>> |
| 95 | + |
| 96 | +// Unchecked downcast. Caller must guarantee correctness. |
| 97 | +unsafe fn Gc::cast_unchecked<U>(this: Gc<T>) -> Gc<U> |
| 98 | + |
| 99 | +// Unchecked reference cast without consuming the pointer. |
| 100 | +unsafe fn Gc::cast_ref_unchecked<U>(this: &Gc<T>) -> &Gc<U> |
| 101 | +``` |
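A checked downcast on an erased pointer behaves much like `Rc<dyn Any>::downcast` from the standard library. The sketch below is an analogy under that assumption and does not reproduce how `boa_gc` performs its type check; both function names are hypothetical.

```rust
use std::any::Any;
use std::rc::Rc;

// Try to recover a concrete `i32` from a type-erased pointer.
fn downcast_i32(erased: Rc<dyn Any>) -> Option<i32> {
    erased.downcast::<i32>().ok().map(|rc| *rc)
}

// One pointer that really holds an `i32`, and one that holds a `String`.
fn demo_downcast() -> (Option<i32>, Option<i32>) {
    let number: Rc<dyn Any> = Rc::new(5_i32);
    let text: Rc<dyn Any> = Rc::new(String::from("five"));
    (downcast_i32(number), downcast_i32(text))
}
```

The unchecked variants skip this runtime check, which is why they are `unsafe` and leave the proof of correctness to the caller.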
| 102 | + |
| 103 | +### Dereferencing |
| 104 | + |
| 105 | +```rust |
| 106 | +impl Deref for Gc<T> { type Target = T; } |
| 107 | +``` |
| 108 | + |
| 109 | +`Gc<T>` transparently dereferences to `T`, allowing direct field and method access. |
| 110 | + |
| 111 | +--- |
| 112 | + |
| 113 | +## 4. Interior Mutability API |
| 114 | + |
| 115 | +### GcRefCell |
| 116 | + |
| 117 | +```rust |
| 118 | +GcRefCell::new(value: T) -> GcRefCell<T> |
| 119 | + |
| 120 | +GcRefCell::borrow(&self) -> GcRef<'_, T> |
| 121 | +GcRefCell::borrow_mut(&self) -> GcRefMut<'_, T> |
| 122 | +GcRefCell::try_borrow(&self) -> Result<GcRef<'_, T>, BorrowError> |
| 123 | +GcRefCell::try_borrow_mut(&self) -> Result<GcRefMut<'_, T>, BorrowMutError> |
| 124 | +GcRefCell::into_inner(self) -> T |
| 125 | +``` |
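Since `GcRefCell` is described as analogous to `RefCell`, the borrow rules behind `try_borrow` and `try_borrow_mut` can be sketched with `std::cell::RefCell`. This assumes `GcRefCell` enforces the same shared-or-exclusive rule; `borrow_rules` is a hypothetical helper.

```rust
use std::cell::RefCell;

// Returns (second shared borrow allowed, mutable borrow allowed while shared,
// mutable borrow allowed after the shared guard is dropped).
fn borrow_rules() -> (bool, bool, bool) {
    let cell = RefCell::new(vec![1, 2, 3]);
    let first = cell.borrow();
    let second_shared_ok = cell.try_borrow().is_ok();
    let mut_while_shared_ok = cell.try_borrow_mut().is_ok();
    drop(first);
    let mut_after_drop_ok = cell.try_borrow_mut().is_ok();
    (second_shared_ok, mut_while_shared_ok, mut_after_drop_ok)
}
```

The `try_` variants report a conflict as an `Err` value, while `borrow` and `borrow_mut` on a `RefCell` panic in the same situation.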
| 126 | + |
| 127 | +### Borrow Guard Mapping |
| 128 | + |
| 129 | +`GcRef` and `GcRefMut` support projecting the borrow into a sub-field of the |
| 130 | +contained value, similar to `std::cell::Ref::map`. |
| 131 | + |
| 132 | +```rust |
| 133 | +GcRef::map<U>(orig: GcRef<'_, T>, f: F) -> GcRef<'_, U> |
| 134 | +GcRef::try_map<U>(orig: GcRef<'_, T>, f: F) -> Result<GcRef<'_, U>, GcRef<'_, T>> |
| 135 | +GcRef::cast<U>(orig: GcRef<'_, T>) -> GcRef<'_, U> // unsafe or checked downcast |
| 136 | + |
| 137 | +GcRefMut::map<U>(orig: GcRefMut<'_, T>, f: F) -> GcRefMut<'_, U> |
| 138 | +GcRefMut::try_map<U>(orig: GcRefMut<'_, T>, f: F) -> Result<GcRefMut<'_, U>, GcRefMut<'_, T>> |
| 139 | +GcRefMut::cast<U>(orig: GcRefMut<'_, T>) -> GcRefMut<'_, U> |
| 140 | +``` |
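Guard projection can be sketched with `std::cell::Ref::map` and `RefMut::filter_map`, which the text above names as the model for this API. The standard library's fallible projection is called `filter_map` and takes a closure returning `Option`, so it is only an approximation of `try_map`. The `Object` type and all function names are hypothetical.

```rust
use std::cell::{Ref, RefCell, RefMut};

// Hypothetical object with two fields to project into.
struct Object {
    name: String,
    slots: Vec<i32>,
}

// Infallible projection: narrow the guard from the whole object to one field.
fn name_len(cell: &RefCell<Object>) -> usize {
    let name: Ref<'_, String> = Ref::map(cell.borrow(), |object| &object.name);
    name.len()
}

// Fallible projection: on failure the original guard is handed back in `Err`.
fn bump_first_slot(cell: &RefCell<Object>) -> bool {
    match RefMut::filter_map(cell.borrow_mut(), |object| object.slots.first_mut()) {
        Ok(mut slot) => {
            *slot += 1;
            true
        }
        Err(_whole_object) => false,
    }
}

fn demo_projection() -> (usize, bool, i32, bool) {
    let full = RefCell::new(Object { name: "obj".to_string(), slots: vec![10] });
    let empty = RefCell::new(Object { name: String::new(), slots: vec![] });
    let len = name_len(&full);
    let bumped = bump_first_slot(&full);
    let slot = full.borrow().slots[0];
    (len, bumped, slot, bump_first_slot(&empty))
}
```

The projected guard keeps the cell borrowed for as long as it lives, so a caller can hold a reference to one field without cloning it.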
| 141 | + |
| 142 | +--- |
| 143 | + |
| 144 | +## 5. Weak Reference System |
| 145 | + |
| 146 | +### WeakGc |
| 147 | + |
| 148 | +```rust |
| 149 | +// Create a weak reference from a strong pointer. |
| 150 | +WeakGc::new(value: &Gc<T>) -> WeakGc<T> |
| 151 | + |
| 152 | +// Attempt to obtain a strong reference. Returns None if the object was collected. |
| 153 | +WeakGc::upgrade(&self) -> Option<Gc<T>> |
| 154 | +``` |
| 155 | + |
| 156 | +A `WeakGc<T>` **must not** prevent its referent from being collected. After collection, |
| 157 | +`upgrade()` must return `None`. |
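The `upgrade` contract can be sketched with `std::rc::Weak`. One difference matters: with `Rc` the referent is freed as soon as the last strong pointer is dropped, whereas a tracing collector would only reclaim it during a later collection cycle. The function name is hypothetical.

```rust
use std::rc::{Rc, Weak};

// Returns the upgraded value before and after the last strong pointer is dropped.
fn upgrade_before_and_after() -> (Option<i32>, Option<i32>) {
    let strong = Rc::new(1);
    let weak: Weak<i32> = Rc::downgrade(&strong);
    let before = weak.upgrade().map(|rc| *rc);
    drop(strong); // the weak pointer alone does not keep the value alive
    let after = weak.upgrade().map(|rc| *rc);
    (before, after)
}
```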
| 158 | + |
| 159 | +### Ephemeron |
| 160 | + |
| 161 | +An `Ephemeron<K, V>` is a key-value pair where the value is only considered reachable |
| 162 | +if the key is independently reachable. `WeakGc<T>` is implemented internally as |
| 163 | +`Ephemeron<T, ()>`. |
| 164 | + |
| 165 | +### WeakMap |
| 166 | + |
| 167 | +`WeakMap<K, V>` is a GC-aware associative container whose entries are automatically |
| 168 | +removed when their key becomes unreachable. It is used by the engine to implement the |
| 169 | +ECMAScript `WeakMap` and `WeakSet` builtins. |
| 170 | + |
| 171 | +--- |
| 172 | + |
| 173 | +## 6. GC Traits |
| 174 | + |
| 175 | +### Trace |
| 176 | + |
| 177 | +```rust |
| 178 | +pub unsafe trait Trace { |
| 179 | + /// Mark all GC pointers contained within this value. |
| 180 | + unsafe fn trace(&self, tracer: &mut Tracer); |
| 181 | + |
| 182 | + /// Count non-root references during the root-detection phase. |
| 183 | + unsafe fn trace_non_roots(&self); |
| 184 | + |
| 185 | + /// Execute the associated finalizer. |
| 186 | + fn run_finalizer(&self); |
| 187 | +} |
| 188 | +``` |
| 189 | + |
| 190 | +Every type stored inside a `Gc<T>` must implement `Trace`. The collector calls `trace` |
| 191 | +during the mark phase to discover reachable objects. |
| 192 | + |
| 193 | +Example implementations: |
| 194 | + |
| 195 | +- https://github.com/boa-dev/boa/blob/main/core/engine/src/object/jsobject.rs |
| 196 | +- https://github.com/boa-dev/boa/blob/main/core/engine/src/context/mod.rs |
| 197 | + |
| 198 | +### Finalize |
| 199 | + |
| 200 | +```rust |
| 201 | +pub trait Finalize { |
| 202 | + /// Called before the object is reclaimed. |
| 203 | + fn finalize(&self); |
| 204 | +} |
| 205 | +``` |
| 206 | + |
| 207 | +`Finalize` provides a cleanup hook that runs **before** memory is freed. Implementations |
| 208 | +may perform resource cleanup (e.g., releasing file handles or detaching array buffers). |
| 209 | + |
| 210 | +> **Important:** Finalizers execute before the sweep phase and may resurrect objects by |
| 211 | +> storing references back into the live graph. The collector must re-mark the heap after |
| 212 | +> finalization to handle this case. |
| 213 | + |
| 214 | +--- |
| 215 | + |
| 216 | +## 7. Derive Macros |
| 217 | + |
| 218 | +### Automatic Derivation |
| 219 | + |
| 220 | +```rust |
| 221 | +#[derive(Trace, Finalize)] |
| 222 | +pub struct MyStruct { |
| 223 | + field: Gc<OtherStruct>, |
| 224 | +} |
| 225 | +``` |
| 226 | + |
| 227 | +`#[derive(Trace)]` generates a `trace` implementation that recursively traces each field. |
| 228 | +`#[derive(Finalize)]` generates an empty finalizer. |
| 229 | + |
| 230 | +### Manual Tracing Helpers |
| 231 | + |
| 232 | +```rust |
| 233 | +// Implement Trace with a custom body. |
| 234 | +custom_trace!(this, mark, { |
| 235 | + mark(&this.field_a); |
| 236 | + mark(&this.field_b); |
| 237 | +}); |
| 238 | + |
| 239 | +// Implement Trace as a no-op for types containing no GC pointers. |
| 240 | +empty_trace!(); |
| 241 | + |
| 242 | +// Unsafe variant of empty_trace for foreign types. |
| 243 | +unsafe_empty_trace!(); |
| 244 | +``` |
| 245 | + |
| 246 | +--- |
| 247 | + |
| 248 | +## 8. Tracing Infrastructure |
| 249 | + |
| 250 | +The `Tracer` type is passed to every `Trace::trace` implementation during the mark phase. |
| 251 | + |
| 252 | +```rust |
| 253 | +impl Tracer { |
| 254 | + /// Enqueue a GC pointer for marking. |
| 255 | + pub fn enqueue(&mut self, ptr: GcErasedPointer); |
| 256 | +} |
| 257 | +``` |
| 258 | + |
| 259 | +The `custom_trace!` macro provides a `mark` closure that calls `tracer.enqueue` internally, |
| 260 | +so most code interacts with the tracer indirectly: |
| 261 | + |
| 262 | +```rust |
| 263 | +custom_trace!(this, mark, { |
| 264 | + mark(&this.some_gc_field); |
| 265 | +}); |
| 266 | +``` |
| 267 | + |
| 268 | +The tracer uses an internal work queue to avoid deep recursion when walking large object |
| 269 | +graphs. |
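A work-queue marker of this kind can be sketched on a toy heap where objects are plain indices. Everything here (`ToyHeap`, `ToyTracer`, `demo_mark`) is a hypothetical simplification and not the `boa_gc` implementation; it only illustrates how enqueueing replaces recursion.

```rust
// Toy heap: `edges[i]` lists the objects that object `i` points to.
struct ToyHeap {
    edges: Vec<Vec<usize>>,
}

// Toy counterpart of `Tracer`: a queue of objects still to visit.
struct ToyTracer {
    queue: Vec<usize>,
}

impl ToyTracer {
    fn enqueue(&mut self, id: usize) {
        self.queue.push(id);
    }
}

impl ToyHeap {
    // Toy counterpart of `Trace::trace`: report each outgoing pointer.
    fn trace(&self, id: usize, tracer: &mut ToyTracer) {
        for &child in &self.edges[id] {
            tracer.enqueue(child);
        }
    }

    // Mark everything reachable from `roots` using a loop instead of recursion.
    fn mark(&self, roots: &[usize]) -> Vec<bool> {
        let mut marked = vec![false; self.edges.len()];
        let mut tracer = ToyTracer { queue: roots.to_vec() };
        while let Some(id) = tracer.queue.pop() {
            if marked[id] {
                continue; // already visited, which also terminates cycles
            }
            marked[id] = true;
            self.trace(id, &mut tracer);
        }
        marked
    }
}

fn demo_mark() -> Vec<bool> {
    // 0 -> 1 -> 2 -> 0 forms a cycle; 3 points to 0 but nothing points to 3.
    let heap = ToyHeap { edges: vec![vec![1], vec![2], vec![0], vec![0]] };
    heap.mark(&[0])
}
```

Because pending work lives in a heap-allocated queue, the depth of the object graph does not translate into call-stack depth.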
| 270 | + |
| 271 | +--- |
| 272 | + |
| 273 | +## 9. Weak Collections |
| 274 | + |
| 275 | +### WeakMap |
| 276 | + |
| 277 | +```rust |
| 278 | +WeakMap::new() -> WeakMap<K, V> |
| 279 | +WeakMap::get(&self, key: &K) -> Option<V> |
| 280 | +WeakMap::set(&self, key: &K, value: V) |
| 281 | +WeakMap::has(&self, key: &K) -> bool |
| 282 | +WeakMap::delete(&self, key: &K) -> bool |
| 283 | +WeakMap::get_or_insert(&self, key: &K, value: V) -> V |
| 284 | +WeakMap::get_or_insert_computed(&self, key: &K, f: F) -> V |
| 285 | +``` |
| 286 | + |
| 287 | +### Ephemeron Rule |
| 288 | + |
| 289 | +> A value in an ephemeron pair remains reachable **only if** the key is independently |
| 290 | +> reachable through the strong reference graph. |
| 291 | +
|
| 292 | +The collector must implement this rule during the mark phase: |
| 293 | +1. Mark all strong roots. |
| 294 | +2. For each ephemeron, if the key is marked, trace the value. |
| 295 | +3. Repeat until no new marks are produced. |
| 296 | +4. Remaining ephemerons with unmarked keys are considered dead. |
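The four steps above can be sketched as a fixed-point loop over a toy heap of indexed objects. The `ToyEphemeron` type and both functions are hypothetical, and a real collector would track ephemerons far more efficiently than rescanning the whole list on every pass.

```rust
// Hypothetical ephemeron: `value` is reachable only if `key` is.
struct ToyEphemeron {
    key: usize,
    value: usize,
}

fn mark_with_ephemerons(
    edges: &[Vec<usize>],
    roots: &[usize],
    ephemerons: &[ToyEphemeron],
) -> Vec<bool> {
    let mut marked = vec![false; edges.len()];
    // Step 1: start from the strong roots.
    let mut queue: Vec<usize> = roots.to_vec();
    loop {
        // Propagate marks along strong edges.
        while let Some(id) = queue.pop() {
            if marked[id] {
                continue;
            }
            marked[id] = true;
            queue.extend(edges[id].iter().copied());
        }
        // Step 2: a marked key makes its value reachable.
        for ephemeron in ephemerons {
            if marked[ephemeron.key] && !marked[ephemeron.value] {
                queue.push(ephemeron.value);
            }
        }
        // Step 3: stop once a full pass produces no new work.
        if queue.is_empty() {
            break;
        }
    }
    marked
}

fn demo_ephemerons() -> (Vec<bool>, Vec<bool>) {
    // Object 0 is the root and points to 1. Objects 2, 4, and 5 are ephemeron values.
    let edges = vec![vec![1], vec![], vec![], vec![], vec![], vec![]];
    let ephemerons = vec![
        ToyEphemeron { key: 2, value: 5 }, // key is itself an ephemeron value
        ToyEphemeron { key: 1, value: 2 },
        ToyEphemeron { key: 3, value: 4 }, // key 3 is unreachable
    ];
    let marked = mark_with_ephemerons(&edges, &[0], &ephemerons);
    // Step 4: ephemerons whose key stayed unmarked are dead.
    let dead: Vec<bool> = ephemerons.iter().map(|e| !marked[e.key]).collect();
    (marked, dead)
}
```

The first ephemeron only becomes live on the second pass, after the second one has marked its key, which is why a single scan is not sufficient.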
| 297 | + |
| 298 | +--- |
| 299 | + |
| 300 | +## 10. Runtime Utilities |
| 301 | + |
| 302 | +```rust |
| 303 | +/// Trigger an immediate garbage collection cycle. |
| 304 | +pub fn force_collect(); |
| 305 | + |
| 306 | +/// Returns true if it is safe to run finalizers (i.e., the collector is not |
| 307 | +/// currently in the sweep/drop phase). |
| 308 | +pub fn finalizer_safe() -> bool; |
| 309 | +``` |
| 310 | + |
| 311 | +`force_collect` is used by tests, the CLI debugger, and indirectly by `WeakRef` deref |
| 312 | +semantics. `finalizer_safe` guards against use-after-free during the drop phase. |
| 313 | + |
| 314 | +--- |
| 315 | + |
| 316 | +## 11. Allocation Model |
| 317 | + |
| 318 | +All GC allocations go through `Gc::new`: |
| 319 | + |
| 320 | +```rust |
| 321 | +let obj = Gc::new(MyValue { ... }); |
| 322 | +let cell = Gc::new(GcRefCell::new(inner)); |
| 323 | +``` |
| 324 | + |
| 325 | +**No heap handle is passed.** The GC runtime manages heap state internally (typically |
| 326 | +via thread-local storage). This means a replacement collector must either: |
| 327 | +- use a thread-local or global allocator, **or** |
| 328 | +- refactor the engine to pass an explicit context (breaking change). |
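The first option, allocating without an explicit heap handle, can be sketched with a `thread_local!` store. The `TOY_HEAP` static, the `Handle` type, and the functions are hypothetical; they only show the shape of the calling convention, not how `boa_gc` lays out its heap.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical per-thread heap: just a list of allocated payloads.
    static TOY_HEAP: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

// Index into the toy heap; plays the role of `Gc<T>`.
#[derive(Clone, Copy, PartialEq, Debug)]
struct Handle(usize);

// Allocate without taking a heap argument, mirroring the shape of `Gc::new`.
fn toy_new(value: &str) -> Handle {
    TOY_HEAP.with(|heap| {
        let mut heap = heap.borrow_mut();
        heap.push(value.to_string());
        Handle(heap.len() - 1)
    })
}

// Read a payload back through its handle.
fn toy_get(handle: Handle) -> String {
    TOY_HEAP.with(|heap| heap.borrow()[handle.0].clone())
}

fn demo_alloc() -> (String, String, bool) {
    let a = toy_new("a");
    let b = toy_new("b");
    (toy_get(a), toy_get(b), a != b)
}
```

Call sites stay as short as `Gc::new(value)`, at the cost of tying every handle to the thread whose heap created it.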
| 329 | + |
| 330 | +--- |
| 331 | + |
| 332 | +## 12. Minimal Compatibility Contract |
| 333 | + |
| 334 | +### Pointer Types |
| 335 | +- `Gc<T>`, `WeakGc<T>`, `GcRefCell<T>`, `GcRef<T>`, `GcRefMut<T>` |
| 336 | + |
| 337 | +### Traits |
| 338 | +- `Trace`, `Finalize` |
| 339 | + |
| 340 | +### Derive & Helper Macros |
| 341 | +- `#[derive(Trace)]`, `#[derive(Finalize)]` |
| 342 | +- `custom_trace!`, `empty_trace!`, `unsafe_empty_trace!` |
| 343 | + |
| 344 | +### Pointer Methods |
| 345 | +- `Gc::new`, `Gc::new_cyclic`, `Gc::into_raw`, `Gc::from_raw` |
| 346 | +- `Gc::ptr_eq`, `Gc::downcast`, `Gc::cast_unchecked`, `Gc::cast_ref_unchecked` |
| 347 | +- `Clone`, `Deref` |
| 348 | + |
| 349 | +### Interior Mutability |
| 350 | +- `GcRefCell::new`, `borrow`, `borrow_mut`, `try_borrow`, `try_borrow_mut`, `into_inner` |
| 351 | +- `GcRef::map`, `GcRef::try_map`, `GcRef::cast` |
| 352 | +- `GcRefMut::map`, `GcRefMut::try_map`, `GcRefMut::cast` |
| 353 | + |
| 354 | +### Weak References |
| 355 | +- `WeakGc::new`, `WeakGc::upgrade` |
| 356 | +- `WeakMap` with full CRUD API |
| 357 | +- Ephemeron semantics |
| 358 | + |
| 359 | +### Runtime Utilities |
| 360 | +- `force_collect()`, `finalizer_safe()` |
| 361 | + |
| 362 | +--- |
| 363 | + |
| 364 | +## 13. Conclusion |
| 365 | + |
| 366 | +Boa relies on a small but precise garbage collector interface organized around five |
| 367 | +concepts: |
| 368 | + |
| 369 | +1. **Strong pointers** (`Gc<T>`) for all heap-allocated engine objects. |
| 370 | +2. **Weak references** (`WeakGc<T>`, `Ephemeron`, `WeakMap`) for caches and JS weak |
| 371 | + collections. |
| 372 | +3. **Interior mutability** (`GcRefCell<T>`) for safe mutation behind shared pointers. |
| 373 | +4. **Trait-based tracing** (`Trace`, `Finalize`) for reachability analysis and cleanup. |
| 374 | +5. **Macro-generated traversal** (`#[derive(Trace)]`, `custom_trace!`) for ergonomic |
| 375 | + integration with 100+ engine types. |
| 376 | + |
| 377 | +Any collector that implements this interface, with stable non-moving pointer identity |
| 378 | +and correct ephemeron support, can serve as a drop-in replacement for `boa_gc`. |