Commit cc93e03

weak_map: replace entries in one lookup and research ephemeron vs key/value pair for insert (#50)
* weak_map: replace entries in one lookup and research ephemeron vs key/value pair approach for insert
1 parent ee925a1 commit cc93e03

4 files changed: +117 -55 lines changed

notes/2026-02-26.md

Lines changed: 3 additions & 1 deletion
@@ -1,5 +1,7 @@
 # WeakMap Support

+Revision: [2026-03-15](2026-03-15.md)
+
 Note author: shruti2522

 This document summarizes all the changes made to add weak map support to the mark sweep collector.
@@ -88,4 +90,4 @@ or write `unsafe` code, and it plugs right into the collector's existing trace
 and sweep phases.

 In the future, the best improvement would be switching from `HashMap` to `HashTable`
-to save memory. Until then, this first version works well and gives us the weak map behavior we need
+to save memory. Until then, this first version works well and gives us the weak map behavior we need.

notes/2026-03-15.md

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+# WeakMap Support Follow up
+
+Note author: shruti2522
+
+Revision of: [2026-02-26](2026-02-26.md)
+
+This document is a follow up to the original weak map support note.
+
+## what changed since the original note
+
+### `WeakMap<K, V>` representation: `HashTable` over `HashMap`
+
+The original note identified using `HashTable` instead of `HashMap` as a
+potential improvement. We implemented this in the move away from
+`HashMap<usize, ArenaPointer<Ephemeron<K, V>>>` to a
+`HashTable<(usize, EphemeronPtr)>` model. The address and ephemeron pointer are
+now stored inline as a tuple in the table entry, which saves the cost of a
+separate key allocation and improves memory locality
+
+### ephemeron vs key/value pair: why we kept the key/value approach
+
+One research question was whether `insert` should take an `Ephemeron` directly
+instead of a `(key, value)` pair. The answer is, it should not. Here's why:
+
+**`Ephemeron` is a GC internal.** The collector owns the lifecycle
+of ephemerons. They get allocated, traced during mark phase, checked during sweep
+for reachability and finalized when their keys die.
+
+If `WeakMap::insert` exposed `Ephemeron` in its public API, we would leak GC
+internals into the user facing weak map interface. Instead, `insert` takes a
+`(key, value)` pair, which is the right boundary. The ephemeron allocation and
+queue registration happen internally via `collector.alloc_ephemeron_node`
+
+This keeps the weak map simple for users while hiding the complex GC details.
+
+### `replace_or_insert`: one lookup instead of two
+
+Previously, `insert` did a two step update: remove any old ephemeron, then
+insert the new one. This meant two lookups in the map and two queue operations.
+
+Now, `replace_or_insert` does both operations in a single `HashTable::find_entry`
+lookup:
+
+- If an entry exists for the address, swap the new ephemeron in and invalidate the old one
+- If no entry exists, insert the new entry
+
+This is faster and also makes sure the old ephemeron is cleaned up before the
+new one takes its place.
+
+### how the collector manages weak maps
+
+The collector owns all weak maps internally. `WeakMap` is just a handle pointing
+to memory the collector owns. During cleanup, the collector prunes dead entries
+from weak maps after marking dead objects but before freeing their memory. This
+order matters because we need to read status bits on the ephemerons to decide
+which entries to keep. If we freed the memory first, those bits would be gone.
+
+## conclusion
+
+`WeakMap::insert` should look simple to users
+(just key and value), while all the ephemeron management stays hidden inside
+the collector.
+
+Changes made since the original note:
+
+1. Switch to `HashTable` to store key and pointer together, saving memory
+2. Use `replace_or_insert` for faster updates (one lookup instead of two)
+3. Confirm that ephemerons should never appear in the user facing `WeakMap` API

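The ordering described in the note's section on how the collector manages weak maps (prune weak map entries after marking, but before freeing memory) can be illustrated with a toy heap. This is only a sketch under stated assumptions: `Slot`, `ToyHeap`, `ToyWeakMap`, `prune`, and `sweep_demo` are hypothetical names, slot indices stand in for addresses, and a plain `Vec` plus `std::collections::HashMap` replace the real pool and `HashTable`. It is not the collector's actual code.

```rust
use std::collections::HashMap;

// Hypothetical heap cell; `marked` plays the role of the status bits.
struct Slot {
    marked: bool,
}

// Hypothetical toy heap; a freed slot becomes `None`.
struct ToyHeap {
    slots: Vec<Option<Slot>>,
}

// Hypothetical weak map keyed by slot index (standing in for a key address).
struct ToyWeakMap {
    entries: HashMap<usize, u32>,
}

impl ToyWeakMap {
    // Keep only entries whose key slot is still allocated and marked.
    // This reads the mark bits, so it must run before the slots are freed.
    fn prune(&mut self, heap: &ToyHeap) {
        self.entries.retain(|&idx, _| {
            matches!(heap.slots.get(idx), Some(Some(slot)) if slot.marked)
        });
    }
}

impl ToyHeap {
    fn alloc(&mut self) -> usize {
        self.slots.push(Some(Slot { marked: false }));
        self.slots.len() - 1
    }

    fn mark(&mut self, idx: usize) {
        if let Some(Some(slot)) = self.slots.get_mut(idx) {
            slot.marked = true;
        }
    }

    // Sweep in the order the note describes: prune first, free second.
    fn sweep(&mut self, maps: &mut [ToyWeakMap]) {
        for map in maps.iter_mut() {
            map.prune(&*self);
        }
        for slot in self.slots.iter_mut() {
            let survives = slot.as_ref().map_or(false, |s| s.marked);
            if survives {
                // Reset the mark for the next collection cycle.
                if let Some(s) = slot.as_mut() {
                    s.marked = false;
                }
            } else {
                *slot = None;
            }
        }
    }
}

// Returns (live entry kept, dead entry pruned).
fn sweep_demo() -> (bool, bool) {
    let mut heap = ToyHeap { slots: Vec::new() };
    let live = heap.alloc();
    let dead = heap.alloc();
    let mut map = ToyWeakMap { entries: HashMap::new() };
    map.entries.insert(live, 1);
    map.entries.insert(dead, 2);
    heap.mark(live);
    heap.sweep(std::slice::from_mut(&mut map));
    (map.entries.contains_key(&live), !map.entries.contains_key(&dead))
}
```

If the two loops in `sweep` were swapped, `prune` would find `None` in the freed slots and could no longer tell a dead key from a never-allocated one, which is the toy analogue of the status bits being gone.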
oscars/src/collectors/mark_sweep/pointers/weak_map.rs

Lines changed: 23 additions & 27 deletions
@@ -40,26 +40,25 @@ impl<K: Trace, V: Trace> WeakMapInner<K, V> {
         }
     }

-    fn remove_and_invalidate(&mut self, key_addr: usize) {
-        if let Ok(entry) = self
-            .entries
-            .find_entry(hash_addr(key_addr), |e| e.0 == key_addr)
-        {
-            let ((_, old_ephemeron), _) = entry.remove();
-            old_ephemeron.as_inner_ref().invalidate();
-        }
-    }
-
-    fn insert_ptr(
+    // insert an entry, returns the old ephemeron pointer if one was replaced, None otherwise
+    fn insert(
         &mut self,
         key_addr: usize,
-        ephemeron_ptr: PoolPointer<'static, Ephemeron<K, V>>,
-    ) {
-        // caller guarantees no duplicate exists since remove_and_invalidate was called first
-        self.entries
-            .insert_unique(hash_addr(key_addr), (key_addr, ephemeron_ptr), |e| {
-                hash_addr(e.0)
-            });
+        new_ptr: PoolPointer<'static, Ephemeron<K, V>>,
+    ) -> Option<PoolPointer<'static, Ephemeron<K, V>>> {
+        let hash = hash_addr(key_addr);
+        match self.entries.find_entry(hash, |e| e.0 == key_addr) {
+            Ok(mut entry) => {
+                // swap without probing again, caller is responsible for invalidating
+                let old = core::mem::replace(entry.get_mut(), (key_addr, new_ptr));
+                Some(old.1)
+            }
+            Err(_absent) => {
+                self.entries
+                    .insert_unique(hash, (key_addr, new_ptr), |e| hash_addr(e.0));
+                None
+            }
+        }
     }

     fn get(&self, key: &Gc<K>) -> Option<&V> {
@@ -132,24 +131,21 @@ impl<K: Trace, V: Trace> WeakMap<K, V> {
         Self { inner }
     }

+    // insert a value for `key`, replacing and invalidating any old ephemeron
     pub fn insert<C: Collector>(&mut self, key: &Gc<K>, value: V, collector: &C) {
         let key_addr = key.inner_ptr.as_non_null().as_ptr() as usize;

-        // remove and invalidate any existing ephemeron for this key
-        // SAFETY: we have unique access to `self`
-        unsafe { self.inner.as_mut().remove_and_invalidate(key_addr) };
-
-        //allocate the new ephemeron node
         let ephemeron_ptr = collector
             .alloc_ephemeron_node(key, value)
             .expect("Failed to allocate ephemeron");

-        // SAFETY: safe because the gc tracks this
+        // SAFETY: the collector keeps the pool alive for the map lifetime
         let ephemeron_ptr = unsafe { ephemeron_ptr.extend_lifetime() };

-        //insert the new node using another short lived mutable borrow
-        // SAFETY: we have unique access to `self`
-        unsafe { self.inner.as_mut().insert_ptr(key_addr, ephemeron_ptr) };
+        // SAFETY: `&mut self` gives exclusive access to `inner`
+        if let Some(old) = unsafe { self.inner.as_mut().insert(key_addr, ephemeron_ptr) } {
+            old.as_inner_ref().invalidate();
+        }
     }

     pub fn get(&self, key: &Gc<K>) -> Option<&V> {

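The single-lookup `insert` in the diff above can be tried out in isolation with a small standard-library model. This is a sketch under stated assumptions, not the crate's code: `ToyEphemeron`, `ToyInner`, and `insert_and_invalidate` are hypothetical names, `Rc` stands in for the pool pointer, and `std::collections::HashMap`'s entry API stands in for the `find_entry` / `insert_unique` pair used on the real `HashTable`.

```rust
use std::cell::Cell;
use std::collections::hash_map::{Entry, HashMap};
use std::rc::Rc;

// Hypothetical stand-in for an ephemeron node; the real one lives in the collector.
struct ToyEphemeron {
    value: u32,
    valid: Cell<bool>,
}

impl ToyEphemeron {
    fn invalidate(&self) {
        self.valid.set(false);
    }
}

// Hypothetical stand-in for the inner map, keyed by the key's address.
struct ToyInner {
    entries: HashMap<usize, Rc<ToyEphemeron>>,
}

impl ToyInner {
    // One probe: swap in place if the slot is occupied, insert if it is vacant.
    // Returns the replaced pointer so the caller can invalidate it.
    fn insert(&mut self, key_addr: usize, new_ptr: Rc<ToyEphemeron>) -> Option<Rc<ToyEphemeron>> {
        match self.entries.entry(key_addr) {
            Entry::Occupied(mut slot) => Some(std::mem::replace(slot.get_mut(), new_ptr)),
            Entry::Vacant(slot) => {
                slot.insert(new_ptr);
                None
            }
        }
    }
}

// Caller side: allocate the new node, insert it, then invalidate whatever was replaced.
fn insert_and_invalidate(
    inner: &mut ToyInner,
    key_addr: usize,
    value: u32,
) -> Option<Rc<ToyEphemeron>> {
    let new_ptr = Rc::new(ToyEphemeron { value, valid: Cell::new(true) });
    let old = inner.insert(key_addr, new_ptr);
    if let Some(old) = &old {
        old.invalidate();
    }
    old
}
```

As in the diff, the inner `insert` only swaps and reports; invalidating the replaced node is left to the caller, so the table code stays free of ephemeron lifecycle logic.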
oscars/src/collectors/mark_sweep_arena2/pointers/weak_map.rs

Lines changed: 23 additions & 27 deletions
@@ -39,26 +39,25 @@ impl<K: Trace, V: Trace> WeakMapInner<K, V> {
         }
     }

-    fn remove_and_invalidate(&mut self, key_addr: usize) {
-        if let Ok(entry) = self
-            .entries
-            .find_entry(hash_addr(key_addr), |e| e.0 == key_addr)
-        {
-            let ((_, old_ephemeron), _) = entry.remove();
-            old_ephemeron.as_inner_ref().invalidate();
-        }
-    }
-
-    fn insert_ptr(
+    // insert an entry, returns the old ephemeron pointer if one was replaced, None otherwise
+    fn insert(
         &mut self,
         key_addr: usize,
-        ephemeron_ptr: ArenaPointer<'static, Ephemeron<K, V>>,
-    ) {
-        // caller guarantees no duplicate exists since remove_and_invalidate was called first
-        self.entries
-            .insert_unique(hash_addr(key_addr), (key_addr, ephemeron_ptr), |e| {
-                hash_addr(e.0)
-            });
+        new_ptr: ArenaPointer<'static, Ephemeron<K, V>>,
+    ) -> Option<ArenaPointer<'static, Ephemeron<K, V>>> {
+        let hash = hash_addr(key_addr);
+        match self.entries.find_entry(hash, |e| e.0 == key_addr) {
+            Ok(mut entry) => {
+                // swap without probing again, caller is responsible for invalidating
+                let old = core::mem::replace(entry.get_mut(), (key_addr, new_ptr));
+                Some(old.1)
+            }
+            Err(_absent) => {
+                self.entries
+                    .insert_unique(hash, (key_addr, new_ptr), |e| hash_addr(e.0));
+                None
+            }
+        }
     }

     fn get(&self, key: &Gc<K>) -> Option<&V> {
@@ -133,6 +132,7 @@ impl<K: Trace, V: Trace> WeakMap<K, V> {
         Self { inner }
     }

+    // insert a value for `key`, replacing and invalidating any old ephemeron
     pub fn insert(
         &mut self,
         key: &Gc<K>,
@@ -141,22 +141,18 @@ impl<K: Trace, V: Trace> WeakMap<K, V> {
     ) {
         let key_addr = key.inner_ptr.as_non_null().as_ptr() as usize;

-        // remove and invalidate any existing ephemeron for this key
-        // SAFETY: we have unique access to `self`
-        unsafe { self.inner.as_mut().remove_and_invalidate(key_addr) };
-
-        //allocate the new ephemeron node
         let ephemeron_ptr = collector
             .alloc_ephemeron_node(key, value)
             .expect("Failed to allocate ephemeron");

-        // SAFETY: safe because the gc tracks this
+        // SAFETY: the collector keeps the pool alive for the map lifetime
         let ephemeron_ptr: ArenaPointer<'static, Ephemeron<K, V>> =
             unsafe { ephemeron_ptr.extend_lifetime() };

-        //insert the new node using another short lived mutable borrow
-        // SAFETY: we have unique access to `self`
-        unsafe { self.inner.as_mut().insert_ptr(key_addr, ephemeron_ptr) };
+        // SAFETY: `&mut self` gives exclusive access to `inner`
+        if let Some(old) = unsafe { self.inner.as_mut().insert(key_addr, ephemeron_ptr) } {
+            old.as_inner_ref().invalidate();
+        }
     }

     pub fn get(&self, key: &Gc<K>) -> Option<&V> {
