-
Notifications
You must be signed in to change notification settings - Fork 146
feat: cacheless hamt iteration #2216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
feat: cacheless hamt iteration #2216
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #2216 +/- ##
==========================================
- Coverage 77.56% 77.50% -0.07%
==========================================
Files 147 147
Lines 15789 15870 +81
==========================================
+ Hits 12247 12300 +53
- Misses 3542 3570 +28
🚀 New features to boost your workflow:
|
fn hash<X>(key: &X) -> HashedKey | ||
where | ||
X: Hash, | ||
X: Hash + ?Sized, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To fix a clippy warning
Ok(()) | ||
} | ||
|
||
pub fn for_each_cacheless<F>(&self, mut f: F) -> anyhow::Result<()> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changelog entry?
All in all, LGTM, but I'll let @rvagg have a final say here, my HAMT-fu is not that strong. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds cacheless iteration functionality to HAMT (Hash Array Mapped Trie) to reduce memory usage when iterating over large HAMTs. The primary addition is for_each_cacheless
methods that avoid caching nodes during iteration, which can be more memory-efficient for one-time traversals.
- Implements
for_each_cacheless
methods for bothHamtImpl
andStateTree
- Adds comprehensive tests for the new cacheless iteration functionality
- Includes benchmarks comparing performance between cached and cacheless iteration
Reviewed Changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
ipld/hamt/tests/hamt_tests.rs | Adds comprehensive test suite for for_each_cacheless functionality and refactors existing test infrastructure |
ipld/hamt/src/pointer.rs | Implements Clone trait for Pointer enum to support cacheless iteration |
ipld/hamt/src/node.rs | Implements Clone trait for Node and adds core for_each_cacheless iteration logic |
ipld/hamt/src/lib.rs | Adds Clone trait to KeyValuePair struct to support cloning operations |
ipld/hamt/src/hash_algorithm.rs | Minor refactor to parameter order for Identity::hash method |
ipld/hamt/src/hamt.rs | Adds public for_each_cacheless method to the main HAMT API |
ipld/hamt/benches/hamt_benchmark.rs | Adds benchmark for for_each_cacheless and refactors existing benchmarks |
ipld/hamt/Cargo.toml | Adds itertools dependency for test utilities |
fvm/src/state_tree.rs | Adds for_each_cacheless method to StateTree API |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
} | ||
IterItem::Borrowed(Pointer::Dirty(node)) => stack.push(node.pointers.iter().into()), | ||
IterItem::Owned(Pointer::Dirty(node)) => { | ||
stack.push(node.pointers.clone().into_iter().into()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The clone()
call here creates an unnecessary copy of the entire pointers vector. Consider restructuring to avoid this clone operation, as it defeats the memory efficiency purpose of the cacheless iteration.
stack.push(node.pointers.clone().into_iter().into()) | |
stack.push(node.pointers.into_iter().into()) |
Copilot uses AI. Check for mistakes.
/// map.set(4, 2).unwrap(); | ||
/// | ||
/// let mut total = 0; | ||
/// map.for_each_cacheless(|_, v: &u64| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type annotation in the example should be &usize
to match the generic type used in the example setup, not &u64
.
/// map.for_each_cacheless(|_, v: &u64| { | |
/// map.for_each_cacheless(|_, v: &usize| { |
Copilot uses AI. Check for mistakes.
(Simliar to #2189)
This PR adds
HamtImpl::for_each_cacheless
andStateTree::for_each_cacheless
to reduce memory usage for iterating over a large hamt, to address #2215 (comment)cargo bench
shows no significant difference betweenfor_each
andfor_each_cacheless