docs(gc): document boa_gc API surface used by the engine #38
Changes from 2 commits
ae00a45
c949288
2bcba8b
b03f203
42b6995
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,350 @@ | ||
| # Boa GC API Surface | ||
|
|
||
| ## 1. Overview | ||
|
|
||
| The Boa JavaScript engine depends on the `boa_gc` crate for all garbage-collected memory | ||
| management. This document defines the subset of the `boa_gc` public API that the engine | ||
| actually uses at compile time and runtime. | ||
|
|
||
| Any alternative garbage collector aiming to replace `boa_gc` (for example, Oscars GC) | ||
| **must implement this interface** to serve as a drop-in replacement. APIs not listed here | ||
| are internal to the collector and are not part of the engine-facing contract. | ||
|
|
||
| --- | ||
|
|
||
| ## 2. Core Pointer Types | ||
|
|
||
| | Type | Role | | ||
| |---|---| | ||
| | `Gc<T>` | Strong, trace-aware smart pointer. Primary way the engine holds GC-managed values. | | ||
| | `WeakGc<T>` | Weak reference that does not prevent collection. Used for caches and JS `WeakRef`. | | ||
| | `GcRefCell<T>` | Interior-mutability wrapper for values stored behind a `Gc`. Analogous to `RefCell`. | | ||
| | `GcRef<'a, T>` | Immutable borrow guard returned by `GcRefCell::borrow`. | | ||
| | `GcRefMut<'a, T>` | Mutable borrow guard returned by `GcRefCell::borrow_mut`. | | ||
|
|
||
| These five types appear in virtually every subsystem of the engine: the object model, | ||
| environments, bytecode compiler, module system, and builtins. | ||
|
|
||
| --- | ||
|
|
||
| ## 3. Pointer Operations | ||
|
|
||
| ### Allocation | ||
|
|
||
| ```rust | ||
| // Allocate a new GC-managed value. | ||
| Gc::new(value: T) -> Gc<T> | ||
|
|
||
| // Allocate a value that may reference itself through a weak pointer. | ||
| Gc::new_cyclic<F>(data_fn: F) -> Gc<T> | ||
| where F: FnOnce(&WeakGc<T>) -> T | ||
|
||
| ``` | ||
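The self-referential construction behind `Gc::new_cyclic` has the same shape as `std::rc::Rc::new_cyclic`. The sketch below uses the standard-library types as stand-ins, which is an assumption made only for illustration: `Node` and `build_node` are hypothetical names, and nothing here calls into `boa_gc`.

```rust
use std::rc::{Rc, Weak};

/// Hypothetical node that keeps a weak handle to its own allocation.
struct Node {
    id: u32,
    me: Weak<Node>,
}

/// Build a node whose `me` field points back at the node itself.
fn build_node(id: u32) -> Rc<Node> {
    // The closure receives a weak pointer to the allocation under construction.
    // It cannot be upgraded yet, but it can be stored for later use.
    Rc::new_cyclic(|weak| Node { id, me: weak.clone() })
}
```

Once construction finishes, upgrading `me` yields a handle to the same allocation. Because the stored handle is weak, it does not keep the node alive on its own.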
|
|
||
| ### Cloning & Identity | ||
|
|
||
| ```rust | ||
| // Duplicate the smart pointer (increments root tracking). | ||
| impl Clone for Gc<T> | ||
|
|
||
| // Compare two pointers by address, not by value. | ||
| Gc::ptr_eq(this: &Gc<T>, other: &Gc<U>) -> bool | ||
| ``` | ||
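The difference between pointer identity and value equality can be illustrated with `std::rc::Rc`, used here purely as a stand-in for `Gc` (an assumption; `compare` is a hypothetical helper).

```rust
use std::rc::Rc;

/// Returns (same_allocation, equal_values) for two handles.
fn compare(a: &Rc<String>, b: &Rc<String>) -> (bool, bool) {
    // `ptr_eq` compares addresses; `==` compares the pointed-to strings.
    (Rc::ptr_eq(a, b), a == b)
}
```

Two separately allocated strings with the same contents compare equal by value but not by identity, which is why an engine that needs object identity would rely on the pointer comparison.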
|
|
||
| ### Raw Pointer Conversion | ||
|
|
||
| Used at FFI boundaries (native function closures, synthetic modules). | ||
|
|
||
| ```rust | ||
| // Consume the Gc and return a raw pointer. Must be paired with from_raw. | ||
| Gc::into_raw(this: Gc<T>) -> NonNull<GcBox<T>> | ||
|
|
||
| // Reconstruct a Gc from a raw pointer previously obtained via into_raw. | ||
| unsafe fn Gc::from_raw(ptr: NonNull<GcBox<T>>) -> Gc<T> | ||
| ``` | ||
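The pairing requirement can be sketched with the analogous `Rc::into_raw` and `Rc::from_raw` from the standard library. This is an illustration under the assumption that `Gc` follows the same ownership rules; `round_trip` is a hypothetical helper and no real FFI call is made.

```rust
use std::rc::Rc;

/// Hand a value across a hypothetical FFI boundary as a raw pointer and take it back.
fn round_trip(value: Rc<u64>) -> Rc<u64> {
    // Ownership moves into the raw pointer; the strong count is not decremented.
    let raw: *const u64 = Rc::into_raw(value);
    // SAFETY: `raw` came from `Rc::into_raw` above and is converted back exactly once.
    unsafe { Rc::from_raw(raw) }
}
```

Skipping the `from_raw` half would leak the allocation, and calling it twice on the same pointer would release it twice, which is why the two calls are documented as a pair.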
|
|
||
| ### Type Casting | ||
|
|
||
| Used by the object model to downcast erased object types. | ||
|
|
||
| ```rust | ||
| // Runtime type check and downcast. | ||
| Gc::downcast<U>(this: Gc<T>) -> Option<Gc<U>> | ||
|
|
||
| // Unchecked downcast. Caller must guarantee correctness. | ||
| unsafe fn Gc::cast_unchecked<U>(this: Gc<T>) -> Gc<U> | ||
|
|
||
| // Unchecked reference cast without consuming the pointer. | ||
| unsafe fn Gc::cast_ref_unchecked<U>(this: &Gc<T>) -> &Gc<U> | ||
| ``` | ||
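A checked downcast on a type-erased pointer can be sketched with `Rc<dyn Any>` from the standard library. This is only an analogy, under the assumption that `Gc::downcast` performs a similar run-time type check; `as_u32` is a hypothetical helper.

```rust
use std::any::Any;
use std::rc::Rc;

/// Try to recover a concrete `u32` from a type-erased handle.
fn as_u32(erased: Rc<dyn Any>) -> Option<Rc<u32>> {
    // `Rc::downcast` checks the type at run time and hands the original
    // handle back in the `Err` variant on mismatch; `.ok()` discards it.
    erased.downcast::<u32>().ok()
}
```

The unchecked variants listed above skip this check, so their callers must already know the concrete type.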
|
|
||
| ### Dereferencing | ||
|
|
||
| ```rust | ||
| impl Deref for Gc<T> { type Target = T; } | ||
| ``` | ||
|
|
||
| `Gc<T>` transparently dereferences to `T`, allowing direct field and method access. | ||
|
|
||
| --- | ||
|
|
||
| ## 4. Interior Mutability API | ||
|
|
||
| ### GcRefCell | ||
|
|
||
| ```rust | ||
| GcRefCell::new(value: T) -> GcRefCell<T> | ||
|
|
||
| GcRefCell::borrow(&self) -> GcRef<'_, T> | ||
| GcRefCell::borrow_mut(&self) -> GcRefMut<'_, T> | ||
| GcRefCell::try_borrow(&self) -> Result<GcRef<'_, T>, BorrowError> | ||
| GcRefCell::try_borrow_mut(&self) -> Result<GcRefMut<'_, T>, BorrowMutError> | ||
| GcRefCell::into_inner(self) -> T | ||
| ``` | ||
|
|
||
| ### Borrow Guard Mapping | ||
|
|
||
| `GcRef` and `GcRefMut` support projecting the borrow into a sub-field of the | ||
| contained value, similar to `std::cell::Ref::map`. | ||
|
|
||
| ```rust | ||
| GcRef::map<U>(orig: GcRef<'_, T>, f: F) -> GcRef<'_, U> | ||
| GcRef::try_map<U>(orig: GcRef<'_, T>, f: F) -> Result<GcRef<'_, U>, GcRef<'_, T>> | ||
| GcRef::cast<U>(orig: GcRef<'_, T>) -> GcRef<'_, U> // unsafe or checked downcast | ||
|
|
||
| GcRefMut::map<U>(orig: GcRefMut<'_, T>, f: F) -> GcRefMut<'_, U> | ||
| GcRefMut::try_map<U>(orig: GcRefMut<'_, T>, f: F) -> Result<GcRefMut<'_, U>, GcRefMut<'_, T>> | ||
| GcRefMut::cast<U>(orig: GcRefMut<'_, T>) -> GcRefMut<'_, U> | ||
| ``` | ||
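Since the mapping API is described as similar to `std::cell::Ref::map`, the standard-library version gives a reasonable picture of the intended behavior. In this sketch `Object`, `name_of`, and `slots_of` are hypothetical names, and `RefCell` stands in for `GcRefCell`.

```rust
use std::cell::{Ref, RefCell, RefMut};

/// Hypothetical object with two fields.
struct Object {
    name: String,
    slots: Vec<i32>,
}

/// Borrow only the `name` field; the guard still holds the shared borrow on the whole cell.
fn name_of(cell: &RefCell<Object>) -> Ref<'_, str> {
    Ref::map(cell.borrow(), |obj| obj.name.as_str())
}

/// Mutably borrow only the `slots` field.
fn slots_of(cell: &RefCell<Object>) -> RefMut<'_, Vec<i32>> {
    RefMut::map(cell.borrow_mut(), |obj| &mut obj.slots)
}
```

A projected guard keeps the original borrow alive, so a mutable borrow of the cell fails while a mapped `Ref` is still in scope.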
|
|
||
| --- | ||
|
|
||
| ## 5. Weak Reference System | ||
|
|
||
| ### WeakGc | ||
|
|
||
| ```rust | ||
| // Create a weak reference from a strong pointer. | ||
| WeakGc::new(value: &Gc<T>) -> WeakGc<T> | ||
|
|
||
| // Attempt to obtain a strong reference. Returns None if the object was collected. | ||
| WeakGc::upgrade(&self) -> Option<Gc<T>> | ||
| ``` | ||
|
|
||
| A `WeakGc<T>` **must not** prevent its referent from being collected. After collection, | ||
| `upgrade()` must return `None`. | ||
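The upgrade contract can be illustrated with `std::rc::Weak`, used here as a stand-in for `WeakGc` (an assumption for illustration only). One difference matters: with reference counting the referent is reclaimed as soon as the last strong handle is dropped, whereas under a tracing collector it would survive until the next collection cycle.

```rust
use std::rc::{Rc, Weak};

/// Returns whether `upgrade` succeeds (before, after) the last strong handle is dropped.
fn upgrade_before_and_after_drop() -> (bool, bool) {
    let strong = Rc::new(String::from("payload"));
    let weak: Weak<String> = Rc::downgrade(&strong);
    // The temporary strong handle from `upgrade` is released at the end of the statement.
    let before = weak.upgrade().is_some();
    drop(strong); // with reference counting this reclaims the value immediately
    let after = weak.upgrade().is_some();
    (before, after)
}
```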
|
|
||
| ### Ephemeron | ||
|
|
||
| An `Ephemeron<K, V>` is a key-value pair where the value is only considered reachable | ||
| if the key is independently reachable. `WeakGc<T>` is implemented internally as | ||
| `Ephemeron<T, ()>`. | ||
|
|
||
| ### WeakMap | ||
|
|
||
| `WeakMap<K, V>` is a GC-aware associative container whose entries are automatically | ||
| removed when their key becomes unreachable. It is used by the engine to implement the | ||
| ECMAScript `WeakMap` and `WeakSet` builtins. | ||
|
|
||
| --- | ||
|
|
||
| ## 6. GC Traits | ||
|
|
||
| ### Trace | ||
|
|
||
| ```rust | ||
| pub unsafe trait Trace { | ||
| /// Mark all GC pointers contained within this value. | ||
| unsafe fn trace(&self, tracer: &mut Tracer); | ||
|
|
||
| /// Count non-root references during the root-detection phase. | ||
| unsafe fn trace_non_roots(&self); | ||
|
|
||
| /// Execute the associated finalizer. | ||
| fn run_finalizer(&self); | ||
| } | ||
| ``` | ||
|
|
||
| Every type stored inside a `Gc<T>` must implement `Trace`. The collector calls `trace` | ||
| during the mark phase to discover reachable objects. | ||
|
|
||
| ### Finalize | ||
|
|
||
| ```rust | ||
| pub trait Finalize { | ||
| /// Called before the object is reclaimed. | ||
| fn finalize(&self); | ||
| } | ||
| ``` | ||
|
|
||
| `Finalize` provides a cleanup hook that runs **before** memory is freed. Implementations | ||
| may perform resource cleanup (e.g., releasing file handles or detaching array buffers). | ||
|
|
||
| > **Important:** Finalizers execute before the sweep phase and may resurrect objects by | ||
| > storing references back into the live graph. The collector must re-mark the heap after | ||
| > finalization to handle this case. | ||
|
|
||
| --- | ||
|
|
||
| ## 7. Derive Macros | ||
|
|
||
| ### Automatic Derivation | ||
|
|
||
| ```rust | ||
| #[derive(Trace, Finalize)] | ||
| pub struct MyStruct { | ||
| field: Gc<OtherStruct>, | ||
| } | ||
| ``` | ||
|
|
||
| `#[derive(Trace)]` generates a `trace` implementation that recursively traces each field. | ||
| `#[derive(Finalize)]` generates an empty finalizer. | ||
|
|
||
| ### Manual Tracing Helpers | ||
|
|
||
| ```rust | ||
| // Implement Trace with a custom body. | ||
| custom_trace!(this, mark, { | ||
| mark(&this.field_a); | ||
| mark(&this.field_b); | ||
| }); | ||
|
|
||
| // Implement Trace as a no-op for types containing no GC pointers. | ||
| empty_trace!(); | ||
|
|
||
| // Unsafe variant of empty_trace for foreign types. | ||
| unsafe_empty_trace!(); | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## 8. Tracing Infrastructure | ||
|
|
||
| The `Tracer` type is passed to every `Trace::trace` implementation during the mark phase. | ||
|
|
||
| ```rust | ||
| impl Tracer { | ||
| /// Enqueue a GC pointer for marking. | ||
| pub fn enqueue(&mut self, ptr: GcErasedPointer); | ||
| } | ||
| ``` | ||
|
|
||
| The `custom_trace!` macro provides a `mark` closure that calls `tracer.enqueue` internally, | ||
| so most code interacts with the tracer indirectly: | ||
|
|
||
| ```rust | ||
| custom_trace!(this, mark, { | ||
| mark(&this.some_gc_field); | ||
| }); | ||
| ``` | ||
|
|
||
| The tracer uses an internal work queue to avoid deep recursion when walking large object | ||
| graphs. | ||
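A queue-driven mark phase can be sketched as follows. This is a simplified model and not the `boa_gc` implementation: objects are plain indices, `edges` is a hypothetical adjacency list standing in for the pointers each object would report from `Trace::trace`, and `mark` is a hypothetical function name.

```rust
use std::collections::VecDeque;

/// Hypothetical heap: object `i` holds strong edges to the indices in `edges[i]`.
/// Returns one mark bit per object.
fn mark(edges: &[Vec<usize>], roots: &[usize]) -> Vec<bool> {
    let mut marked = vec![false; edges.len()];
    let mut queue: VecDeque<usize> = roots.iter().copied().collect();
    while let Some(obj) = queue.pop_front() {
        if marked[obj] {
            continue; // already visited, which also terminates cycles
        }
        marked[obj] = true;
        // Stands in for `Trace::trace` enqueueing each child pointer on the tracer.
        queue.extend(edges[obj].iter().copied());
    }
    marked
}
```

Because pending objects live in the queue rather than on the call stack, the depth of the object graph does not affect stack usage.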
|
|
||
| --- | ||
|
|
||
| ## 9. Weak Collections | ||
|
|
||
| ### WeakMap | ||
|
|
||
| ```rust | ||
| WeakMap::new() -> WeakMap<K, V> | ||
| WeakMap::get(&self, key: &K) -> Option<V> | ||
| WeakMap::set(&self, key: &K, value: V) | ||
| WeakMap::has(&self, key: &K) -> bool | ||
| WeakMap::delete(&self, key: &K) -> bool | ||
| WeakMap::get_or_insert(&self, key: &K, value: V) -> V | ||
| WeakMap::get_or_insert_computed(&self, key: &K, f: F) -> V | ||
| ``` | ||
|
|
||
| ### Ephemeron Rule | ||
|
|
||
| > A value in an ephemeron pair remains reachable **only if** the key is independently | ||
| > reachable through the strong reference graph. | ||
|
|
||
| The collector must implement this rule during the mark phase: | ||
| 1. Mark all strong roots. | ||
| 2. For each ephemeron, if the key is marked, trace the value. | ||
| 3. Repeat until no new marks are produced. | ||
| 4. Remaining ephemerons with unmarked keys are considered dead. | ||
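The four steps above can be sketched as a fixpoint loop. This is a minimal model under stated assumptions, not the `boa_gc` algorithm: objects are indices, `edges` is a hypothetical adjacency list of strong references, and the `Ephemeron` struct and `mark_with_ephemerons` function are illustrative names.

```rust
/// Hypothetical ephemeron: `value` is traced only once `key` is marked.
struct Ephemeron {
    key: usize,
    value: usize,
}

/// Returns (mark bit per object, liveness per ephemeron).
fn mark_with_ephemerons(
    edges: &[Vec<usize>],
    roots: &[usize],
    ephemerons: &[Ephemeron],
) -> (Vec<bool>, Vec<bool>) {
    let mut marked = vec![false; edges.len()];
    let mut stack: Vec<usize> = roots.to_vec(); // step 1: start from the strong roots
    loop {
        // Drain everything reachable through strong edges.
        while let Some(obj) = stack.pop() {
            if !marked[obj] {
                marked[obj] = true;
                stack.extend(edges[obj].iter().copied());
            }
        }
        // Step 2: a marked key makes its value reachable.
        for e in ephemerons {
            if marked[e.key] && !marked[e.value] {
                stack.push(e.value);
            }
        }
        // Step 3: stop once a full pass produced no new work.
        if stack.is_empty() {
            break;
        }
    }
    // Step 4: ephemerons whose key was never marked are dead.
    let live: Vec<bool> = ephemerons.iter().map(|e| marked[e.key]).collect();
    (marked, live)
}
```

The loop is needed because marking one ephemeron's value can make another ephemeron's key reachable. An ephemeron whose key is reachable only through its own value stays dead.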
|
|
||
| --- | ||
|
|
||
| ## 10. Runtime Utilities | ||
|
|
||
| ```rust | ||
| /// Trigger an immediate garbage collection cycle. | ||
| pub fn force_collect(); | ||
|
|
||
| /// Returns true if it is safe to run finalizers (i.e., the collector is not | ||
| /// currently in the sweep/drop phase). | ||
| pub fn finalizer_safe() -> bool; | ||
| ``` | ||
|
|
||
| `force_collect` is used by tests, the CLI debugger, and indirectly by `WeakRef` deref | ||
| semantics. `finalizer_safe` guards against use-after-free during the drop phase. | ||
|
|
||
| --- | ||
|
|
||
| ## 11. Allocation Model | ||
|
|
||
| All GC allocations go through `Gc::new`: | ||
|
|
||
| ```rust | ||
| let obj = Gc::new(MyValue { ... }); | ||
| let cell = Gc::new(GcRefCell::new(inner)); | ||
| ``` | ||
|
|
||
| **No heap handle is passed.** The GC runtime manages heap state internally (typically | ||
| via thread-local storage). This means a replacement collector must either: | ||
| - use a thread-local or global allocator, **or** | ||
| - refactor the engine to pass an explicit context (breaking change). | ||
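The first option, allocation without an explicit heap handle, can be sketched with a thread-local. Everything here is hypothetical: `HEAP`, `alloc`, and `heap_len` are illustrative names, and a real collector would store allocation metadata rather than plain integers.

```rust
use std::cell::RefCell;

thread_local! {
    /// Hypothetical per-thread heap state.
    static HEAP: RefCell<Vec<u64>> = RefCell::new(Vec::new());
}

/// Allocate without an explicit heap argument and return the slot index.
fn alloc(value: u64) -> usize {
    HEAP.with(|heap| {
        let mut heap = heap.borrow_mut();
        heap.push(value);
        heap.len() - 1
    })
}

/// Number of slots on the current thread's heap.
fn heap_len() -> usize {
    HEAP.with(|heap| heap.borrow().len())
}
```

Each thread sees its own heap, so values allocated on one thread are invisible to another. A replacement collector built this way would need to account for that.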
|
|
||
| --- | ||
|
|
||
| ## 12. Minimal Compatibility Contract | ||
|
|
||
| ### Pointer Types | ||
| - `Gc<T>`, `WeakGc<T>`, `GcRefCell<T>`, `GcRef<'a, T>`, `GcRefMut<'a, T>` | ||
|
|
||
| ### Traits | ||
| - `Trace`, `Finalize` | ||
|
|
||
| ### Derive & Helper Macros | ||
| - `#[derive(Trace)]`, `#[derive(Finalize)]` | ||
| - `custom_trace!`, `empty_trace!`, `unsafe_empty_trace!` | ||
|
|
||
| ### Pointer Methods | ||
| - `Gc::new`, `Gc::new_cyclic`, `Gc::into_raw`, `Gc::from_raw` | ||
| - `Gc::ptr_eq`, `Gc::downcast`, `Gc::cast_unchecked`, `Gc::cast_ref_unchecked` | ||
| - `Clone`, `Deref` | ||
|
|
||
| ### Interior Mutability | ||
| - `GcRefCell::new`, `borrow`, `borrow_mut`, `try_borrow`, `try_borrow_mut`, `into_inner` | ||
| - `GcRef::map`, `GcRef::try_map`, `GcRef::cast` | ||
| - `GcRefMut::map`, `GcRefMut::try_map`, `GcRefMut::cast` | ||
|
|
||
| ### Weak References | ||
| - `WeakGc::new`, `WeakGc::upgrade` | ||
| - `WeakMap` with the operations listed in Section 9 | ||
| - Ephemeron semantics | ||
|
|
||
| ### Runtime Utilities | ||
| - `force_collect()`, `finalizer_safe()` | ||
|
|
||
| --- | ||
|
|
||
| ## 13. Conclusion | ||
|
|
||
| Boa relies on a small but precise garbage collector interface organized around five | ||
| concepts: | ||
|
|
||
| 1. **Strong pointers** (`Gc<T>`) for all heap-allocated engine objects. | ||
| 2. **Weak references** (`WeakGc<T>`, `Ephemeron`, `WeakMap`) for caches and JS weak | ||
| collections. | ||
| 3. **Interior mutability** (`GcRefCell<T>`) for safe mutation behind shared pointers. | ||
| 4. **Trait-based tracing** (`Trace`, `Finalize`) for reachability analysis and cleanup. | ||
| 5. **Macro-generated traversal** (`#[derive(Trace)]`, `custom_trace!`) for ergonomic | ||
| integration with 100+ engine types. | ||
|
|
||
| Any collector that implements this interface — with stable non-moving pointer identity | ||
| and correct ephemeron support — can serve as a drop-in replacement for `boa_gc`. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -627,3 +627,21 @@ mod gc_edge_cases { | |
| collector.collect(); | ||
| } | ||
| } | ||
|
|
||
| /// Tests for the standalone WeakGc pointer behavior. | ||
| mod weak_gc_tests { | ||
|
||
| use crate::collectors::mark_sweep::MarkSweepGarbageCollector; | ||
| use crate::collectors::mark_sweep::cell::GcRefCell; | ||
| use crate::collectors::mark_sweep::pointers::{Gc, WeakGc}; | ||
|
|
||
| #[test] | ||
| fn weak_gc_can_be_created_from_strong() { | ||
| let collector = &mut MarkSweepGarbageCollector::default() | ||
| .with_arena_size(256) | ||
| .with_heap_threshold(512); | ||
|
|
||
| let strong = Gc::new_in(GcRefCell::new(42u64), collector); | ||
|
|
||
| let _weak = WeakGc::new_in(&strong, collector); | ||
| } | ||
| } | ||