Skip to content

Conversation

hanabi1224
Copy link
Contributor

@hanabi1224 hanabi1224 commented Sep 4, 2025

(Simliar to #2189)
This PR adds HamtImpl::for_each_cacheless and StateTree::for_each_cacheless to reduce memory usage for iterating over a large hamt, to address #2215 (comment)

cargo bench shows no significant difference between for_each and for_each_cacheless

HAMT for_each function  time:   [108.24 µs 108.70 µs 109.19 µs]
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

HAMT for_each_cacheless function
                        time:   [106.85 µs 107.63 µs 108.58 µs]
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

@github-project-automation github-project-automation bot moved this to 📌 Triage in FilOz Sep 4, 2025
@codecov-commenter
Copy link

codecov-commenter commented Sep 4, 2025

Codecov Report

❌ Patch coverage is 65.43210% with 28 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.50%. Comparing base (fe4d5c1) to head (313aec8).

Files with missing lines Patch % Lines
fvm/src/state_tree.rs 0.00% 10 Missing ⚠️
ipld/hamt/src/node.rs 83.01% 9 Missing ⚠️
ipld/hamt/src/pointer.rs 0.00% 9 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2216      +/-   ##
==========================================
- Coverage   77.56%   77.50%   -0.07%     
==========================================
  Files         147      147              
  Lines       15789    15870      +81     
==========================================
+ Hits        12247    12300      +53     
- Misses       3542     3570      +28     
Files with missing lines Coverage Δ
ipld/hamt/src/hamt.rs 97.27% <100.00%> (+0.11%) ⬆️
ipld/hamt/src/hash_algorithm.rs 73.33% <ø> (ø)
ipld/hamt/src/lib.rs 100.00% <ø> (ø)
ipld/hamt/src/node.rs 90.26% <83.01%> (-1.05%) ⬇️
ipld/hamt/src/pointer.rs 79.06% <0.00%> (-5.94%) ⬇️
fvm/src/state_tree.rs 72.61% <0.00%> (-3.15%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@hanabi1224 hanabi1224 marked this pull request as ready for review September 4, 2025 12:31
fn hash<X>(key: &X) -> HashedKey
where
X: Hash,
X: Hash + ?Sized,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To fix a clippy warning

@hanabi1224 hanabi1224 self-assigned this Sep 5, 2025
@BigLep BigLep moved this from 📌 Triage to 🔎 Awaiting Review in FilOz Sep 8, 2025
Ok(())
}

pub fn for_each_cacheless<F>(&self, mut f: F) -> anyhow::Result<()>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changelog entry?

@LesnyRumcajs
Copy link
Contributor

All in all, LGTM, but I'll let @rvagg have a final say here, my HAMT-fu is not that strong.

@BigLep BigLep requested a review from Copilot September 16, 2025 05:24
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds cacheless iteration functionality to HAMT (Hash Array Mapped Trie) to reduce memory usage when iterating over large HAMTs. The primary addition is for_each_cacheless methods that avoid caching nodes during iteration, which can be more memory-efficient for one-time traversals.

  • Implements for_each_cacheless methods for both HamtImpl and StateTree
  • Adds comprehensive tests for the new cacheless iteration functionality
  • Includes benchmarks comparing performance between cached and cacheless iteration

Reviewed Changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
ipld/hamt/tests/hamt_tests.rs Adds comprehensive test suite for for_each_cacheless functionality and refactors existing test infrastructure
ipld/hamt/src/pointer.rs Implements Clone trait for Pointer enum to support cacheless iteration
ipld/hamt/src/node.rs Implements Clone trait for Node and adds core for_each_cacheless iteration logic
ipld/hamt/src/lib.rs Adds Clone trait to KeyValuePair struct to support cloning operations
ipld/hamt/src/hash_algorithm.rs Minor refactor to parameter order for Identity::hash method
ipld/hamt/src/hamt.rs Adds public for_each_cacheless method to the main HAMT API
ipld/hamt/benches/hamt_benchmark.rs Adds benchmark for for_each_cacheless and refactors existing benchmarks
ipld/hamt/Cargo.toml Adds itertools dependency for test utilities
fvm/src/state_tree.rs Adds for_each_cacheless method to StateTree API

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

}
IterItem::Borrowed(Pointer::Dirty(node)) => stack.push(node.pointers.iter().into()),
IterItem::Owned(Pointer::Dirty(node)) => {
stack.push(node.pointers.clone().into_iter().into())
Copy link
Preview

Copilot AI Sep 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The clone() call here creates an unnecessary copy of the entire pointers vector. Consider restructuring to avoid this clone operation, as it defeats the memory efficiency purpose of the cacheless iteration.

Suggested change
stack.push(node.pointers.clone().into_iter().into())
stack.push(node.pointers.into_iter().into())

Copilot uses AI. Check for mistakes.

/// map.set(4, 2).unwrap();
///
/// let mut total = 0;
/// map.for_each_cacheless(|_, v: &u64| {
Copy link
Preview

Copilot AI Sep 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type annotation in the example should be &usize to match the generic type used in the example setup, not &u64.

Suggested change
/// map.for_each_cacheless(|_, v: &u64| {
/// map.for_each_cacheless(|_, v: &usize| {

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🔎 Awaiting Review
Development

Successfully merging this pull request may close these issues.

3 participants