- 
                Notifications
    
You must be signed in to change notification settings  - Fork 1.5k
 
Description
Upon pondering the various rooting options in #11326, and reading through the discussion of API tradeoffs here, it occurs to me that there may be a possible addition to the API, which offers more natural "owned root" semantics in Rust but is still pay-as-you-go, which I'd like to sketch here.
Background and Motivation
Currently, we have two types, Rooted<T> and ManuallyRooted<T>. The former has a LIFO discipline, implicitly attaching to the deepest RootScope in a stack of scopes on the store, and the latter is completely manual. Notably, the latter does not have a Drop impl that actually unroots, because doing so would require holding a reference to the Store somehow -- implicitly with Arcs somehow, because actual borrows of the Store would preclude any other operations.
The LIFO discipline of Rooted works well when it works, i.e., when the user is aware of their scopes, but in this comment I outline a few of the realizations I've had around ergonomics when it combines with other features -- in particular, Rust's ? operator, which pushes users to naturally "propagate errors upward without explicit thought". The usual expectation is that the E type in the Result somehow owns its info. So a dynamic failure (a panic, no less!) when that E crosses over a scope is quite surprising. Exceptional returns are the most obvious example to me right now, but I believe some of the surprise here may also occur with users who naively (but understandably) expect that "GC" means they don't have to worry about lifetimes. Said more succinctly, a type named Rooted in a language like Rust where types often imply ownership might imply that it keeps a root as long as it exists. I know I certainly had that expectation at first. (The docs are very good, but "least surprise" still applies here!)
We provide the "escape hatch", as the docs describe, of ManuallyRooted, and this can certainly work well if the user has extreme discipline -- unfortunately, the requirement of a manual unrooting step means that it is very easy to get wrong, again as the docs describe well.
The middle ground, of a type that somehow unregisters itself on Drop by keeping enough of a handle on the engine's internal state to do so, is described as impractical: it would require an Arc<Mutex<...>> to track internal handle state/registration, as the docs say. This would create synchronization overhead, and potentially pessimize common-case GC operations too. Is there a better way?
Idea: Pay-as-you-Go "Arc'd liveness flags"
Ideally, I want something that:
- Acts as an "owned root", i.e., keeps the referent alive as long as the root type exists and the store exists
 - Has a proper 
Dropsuch that dropping the root type ensures the referent is not leaked(*) - Imposes zero overhead on GC if these roots are not used, and imposes only small overhead proportional to the number of such roots if they are
 
Note the sneaky (*) above: I want to avoid leaks, not to eagerly detect when an unregistration occurs. In other words, let's permit deferring some action from the Drop to, say, the next time a GC runs.
Then to avoid synchronization overhead and contention, rather than a large monolithic registry under a single mutex, let's have some small shared state per owned root.
The idea is: keep a "liveness flag" in an Arc<AtomicBool>. In steady state, when live, there are two references to this Arc: from the owned root type, and from a owned_roots: Vec<(Option<VMGcRef>, Arc<AtomicBool>)> in the GC roots list. When the owned root drops, it sets its atomic bool to false, and drops its Arc reference. When the GC scans roots, it reads out the liveness flags, and removes those roots that the owner has dropped. (E.g. via a retain on the Vec.)1
Two realizations make this more efficient than the Arc<Mutex<whole root list>> approach: (i) we do have a mut borrow to the store when we create, so it's fine to have a normal Vec of roots registered -- only Drop is "remote" without the store; (ii) we have a separate bit of state per root, and it's just an atomic bool, which on common architectures (x86 and Apple Silicon's aarch64 at least) has atomic loads that are exactly as cheap as normal loads.
It's also fully safe Rust (Arc makes it so), and is pay-as-you-go: with no such roots existing, GC root scanning has one check of vec-is-empty and a never-taken branch; as close to zero-overhead as we can make it. There is no mutex contention anywhere, because the only "meeting point" is the Vec that's mutated under a &mut Store. In terms of memory allocation, it's certainly more expensive than a LIFO root (which is just an index into an array!), because there's the separate Arc allocation, but I suspect most uses of these roots are likely to be relatively high-level "entry points" or cases like exception returns where scopes don't map well to usage patterns; we can encourage use of LIFO scopes where possible.
Naming
I've called this an "owned root" and I'd gently suggest considering different names for our current Rooted, to more explicitly describe the difference, if we adopt this -- something like ScopedRooted vs. OwnedRooted?
Footnotes
- 
Slight variant: the tuple above could instead be a single
Arc<OwnedRootData>with liveness and theVMGcRef, and maybe that's cleaner; I haven't thought too much about how this would live alongsideManuallyRootedand whether it would want to share a GC-ref root array somewhere else or not... ↩