Skip to content

perf: parallelize HashedPostStateSorted::from_reverts hashing/sorting #20049

@yongkangc

Description

@yongkangc

Summary

Explore a perf experiment for HashedPostStateSorted::from_reverts to parallelize the hashing/sorting step (not the DB cursor walks) so we can speed up large revert ranges without extra allocations.

Motivation

Proposal

  • Keep DB cursor iteration sequential (required), but parallelize the post-collection hashing/sorting with rayon behind an opt-in feature flag (e.g. parallel-from-reverts).
  • Use thresholds to avoid rayon overhead on small inputs (e.g. accounts > 1k and per-account slots > 32 use par_sort_unstable_by_key, else stay sequential).
  • Hash keys during collection to avoid an extra map pass if it helps CPU profiles.

Acceptance Criteria

  • Implement gated parallel path with sensible thresholds and benchmarks comparing sequential vs parallel on large synthetic revert sets (e.g. 10k+ accounts, skewed slot distributions).
  • Measure CPU and wall-time deltas; document when parallel path wins/loses.
  • Keep default behavior unchanged (feature flag off) to avoid footprint changes.

Risks / Notes

  • rayon dependency and build footprint (hence feature flag).
  • Parallel sorting only helps when key counts are large; overhead may regress small cases.

Metadata

Metadata

Assignees

Labels

A-trieRelated to Merkle Patricia Trie implementationC-perfA change motivated by improving speed, memory usage or disk footprintD-good-first-issueNice and easy! A great choice to get startedS-needs-benchmarkThis set of changes needs performance benchmarking to double-check that they help

Type

No type

Projects

Status

Backlog

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions