Conversation
Code Review: perf: misc micro-optimizations

### Summary

This PR introduces three micro-optimizations to the MPT (Merkle Patricia Trie) implementation. The changes are well-targeted and demonstrate a good understanding of performance hotspots. Overall, this is a solid performance improvement with a clean implementation.

### ✅ Strengths

**1. Cell vs RefCell Change (trie.rs:63)**

Excellent optimization. Replacing the RefCell-based reference cache with Cell removes runtime borrow tracking from a hot path. The migration is implemented consistently across all usage sites (reference_encode, reference_length, hash, add_node, invalidate_ref_cache).
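For context, a minimal sketch of the shape of this change; the struct and method names here are hypothetical stand-ins (only the `cached_references` field name comes from the PR), and the exact container type is an assumption:

```rust
use std::cell::Cell;

// Hypothetical stand-in: Cell<Option<T>> only works because the cached value is Copy.
#[derive(Clone, Copy)]
struct NodeRef(u64);

struct Trie {
    // The PR swaps RefCell for Cell here; Cell's get()/set() just copy the
    // value, with no runtime borrow flag to maintain and no borrow panics.
    cached_references: Vec<Cell<Option<NodeRef>>>,
}

impl Trie {
    fn cached(&self, node_id: usize) -> Option<NodeRef> {
        // Plain copy out of the cell on the hot path.
        self.cached_references[node_id].get()
    }
}

fn main() {
    let trie = Trie { cached_references: vec![Cell::new(Some(NodeRef(1)))] };
    assert!(trie.cached(0).is_some());
    println!("ok");
}
```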
**2. Exact Capacity Allocation (hp.rs:176, 201)**

Good improvement. The calculation `1 + nibs.len() / 2` matches the exact encoded length, so `with_capacity` reserves precisely what is needed. This eliminates over-allocation from the previous fixed 33-byte capacity.
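As a quick sanity check on that formula (a standalone sketch, not code from hp.rs; it assumes the usual hex-prefix layout where the flag byte absorbs the first nibble of an odd-length path):

```rust
// Exact hex-prefix encoded size for n path nibbles:
//   even n: 1 flag byte + n/2 packed bytes
//   odd  n: the flag byte also holds the first nibble, then (n - 1)/2 packed bytes
// Both reduce to 1 + n / 2 under integer division.
fn encoded_len(n: usize) -> usize {
    1 + n / 2
}

fn main() {
    let expected = [1, 1, 2, 2, 3, 3]; // n = 0..=5
    for (n, want) in expected.iter().enumerate() {
        assert_eq!(encoded_len(n), *want);
    }
    println!("ok");
}
```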
**3. Skip Nibble Decode on Hot Paths (trie.rs:653, 699, 819, 826)**

Smart optimization. The new helper functions work directly on the compact encoding, so the hot paths no longer need to expand the path into nibbles first. The implementations in hp.rs (77-166) are careful and correct, handling both odd and even-length encoded paths.
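For illustration, one such helper might count nibbles straight from the flag nibble. The function name mirrors the test name in hp.rs, but this body is a sketch of the idea under the standard hex-prefix layout, not the PR's implementation:

```rust
/// Sketch: number of path nibbles in a hex-prefix encoded path, assuming the
/// standard layout (first byte = flag nibble plus, for odd lengths, the first
/// path nibble; odd-length flag in bit 0 of the flag nibble).
fn encoded_path_nibble_count(encoded: &[u8]) -> usize {
    let is_odd = (encoded[0] >> 4) & 0x1 == 1;
    // Every byte after the first contributes two nibbles; the first byte
    // contributes one nibble only when the odd-length flag is set.
    2 * (encoded.len() - 1) + usize::from(is_odd)
}

fn main() {
    // 0x20, 0x01, 0x23 -> leaf flag, even length, nibbles [0, 1, 2, 3]
    assert_eq!(encoded_path_nibble_count(&[0x20, 0x01, 0x23]), 4);
    // 0x31, 0x23 -> leaf flag, odd length, nibbles [1, 2, 3]
    assert_eq!(encoded_path_nibble_count(&[0x31, 0x23]), 3);
    println!("ok");
}
```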
### 🔍 Observations & Minor Concerns

**1. Code Duplication in Cache Access Pattern**

The cache lookup pattern is repeated three times (reference_encode, reference_length, hash):

```rust
let cached = self.cached_references[node_id as usize].get();
let node_ref = match cached {
    Some(node_ref) => node_ref,
    None => {
        let node_ref = self.calc_reference(node_id);
        self.cached_references[node_id as usize].set(Some(node_ref));
        node_ref
    }
};
```

Suggestion: Consider extracting this into a helper method to reduce duplication:

```rust
#[inline]
fn get_or_calc_reference(&self, node_id: NodeId) -> NodeRef<'a> {
    self.cached_references[node_id as usize]
        .get()
        .unwrap_or_else(|| {
            let node_ref = self.calc_reference(node_id);
            self.cached_references[node_id as usize].set(Some(node_ref));
            node_ref
        })
}
```

This would simplify the three call sites and make future changes easier.

**2. Bounds Checking in Hot Path**

The array indexing `self.cached_references[node_id as usize]` is bounds-checked on every cache access. Suggestion (optional): If profiling shows bounds-checking overhead, consider using an unchecked access.
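If that ever shows up in profiles, one possible shape is sketched below; the function and types are hypothetical stand-ins, `get_unchecked` is `unsafe`, and it is only valid if `node_id` is guaranteed to be in range:

```rust
use std::cell::Cell;

// Hypothetical stand-in types so the sketch compiles on its own.
#[derive(Clone, Copy)]
struct NodeRef(u64);
type NodeId = u32;

// SAFETY contract: callers must guarantee `node_id` indexes into `cache`.
// The debug_assert keeps that invariant checked in debug builds only.
#[inline]
fn cached_reference_unchecked(cache: &[Cell<Option<NodeRef>>], node_id: NodeId) -> Option<NodeRef> {
    debug_assert!((node_id as usize) < cache.len());
    unsafe { cache.get_unchecked(node_id as usize) }.get()
}

fn main() {
    let cache = vec![Cell::new(None), Cell::new(Some(NodeRef(7)))];
    assert_eq!(cached_reference_unchecked(&cache, 1).map(|r| r.0), Some(7));
    println!("ok");
}
```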
**3. nibble/nibs Naming Inconsistency**

The code uses both "nibble count" and "nib count" terminology; consider standardizing on one.

### 🧪 Testing

Good coverage: The existing tests in hp.rs (test_encoded_path_nibble_count, test_eq_and_strip_prefix) cover the new helper functions. The high-level MPT tests should catch any regressions in the insert/delete hot paths.

Recommendation: Verify that the test suite runs successfully and covers edge cases like:
### 🔒 Security

No security concerns identified. The changes:

### 📊 Performance

The benchmark results look promising:

The parallel improvement is more significant, suggesting the optimizations help with concurrent workloads where cache/allocation contention matters more.

### 📝 Minor Nits

### Final Verdict

✅ LGTM with minor suggestions

This is a well-executed micro-optimization PR. The changes are:

The only improvement would be reducing the code duplication in the cache access pattern, but that's a minor refactoring opportunity rather than a blocker.

Recommendation: Approve and merge. Consider the helper method extraction as a follow-up cleanup if desired.
shayanh left a comment
LGTM. I'm not sure if the second optimization is necessarily better. Maybe better to do 1+3 or only 1.
```rust
let encoded_len = 1 + (nibs.len() / 2);
let mut encoded = Vec::with_capacity(encoded_len);
```
I'm not sure if this is necessarily better.
Well, the bench is improved slightly. It's n=1, but it's better than nothing. Given that it's a one-line change, I am keeping it.
Cell instead of RefCell -- slightly uglier but faster. I tried it previously and it did not work; now it works, not sure what the issue was.

This is the main optimization. Before, insert/delete would always call prefix_to_nibs to expand the compact path into a SmallVec, even when the key matched or the prefix was clearly a match. That allocation/expansion is now avoided on the hot path.

Bench links:
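For intuition, a rough sketch of the kind of comparison that avoids the expansion; the helper name and layout assumptions (standard hex-prefix encoding, non-empty prefix) are hypothetical, not the actual hp.rs code:

```rust
/// Sketch: does `key_nibbles` start with the path stored in the hex-prefix
/// `encoded` prefix? Assumes the standard layout: first byte holds the flag
/// nibble (bit 0 = odd length) plus, for odd lengths, the first path nibble;
/// remaining bytes pack two nibbles each. No SmallVec expansion is needed.
fn key_starts_with_encoded(encoded: &[u8], key_nibbles: &[u8]) -> bool {
    let is_odd = (encoded[0] >> 4) & 0x1 == 1;
    let prefix_len = 2 * (encoded.len() - 1) + usize::from(is_odd);
    if key_nibbles.len() < prefix_len {
        return false;
    }
    let mut key = key_nibbles.iter().copied();
    // For odd lengths, the first path nibble lives in the low half of the flag byte.
    if is_odd && key.next() != Some(encoded[0] & 0x0f) {
        return false;
    }
    // Each remaining byte packs two nibbles, high then low.
    encoded[1..]
        .iter()
        .all(|&b| key.next() == Some(b >> 4) && key.next() == Some(b & 0x0f))
}

fn main() {
    // Encoded prefix 0x11, 0x23 -> odd-length extension, nibbles [1, 2, 3].
    assert!(key_starts_with_encoded(&[0x11, 0x23], &[1, 2, 3, 4]));
    assert!(!key_starts_with_encoded(&[0x11, 0x23], &[1, 2, 4]));
    println!("ok");
}
```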
Key Observations