Conversation

@samueltardieu (Member) commented Jan 2, 2026

Summary by CodeRabbit

  • Refactor
    • Improved internal data structure initialization and pre-sizing to reduce memory reallocations and improve performance during component processing, with no changes to public behavior or interfaces.


@coderabbitai bot commented Jan 2, 2026

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

The PR switches to FxHashMap with an explicit FxBuildHasher and precomputes capacity estimates in separate_components and components, initializing hash maps with those capacities to reduce reallocations; no public signatures changed.

Changes

Cohort / File(s): Connected components optimizations (src/undirected/connected_components.rs)
Summary: Add the FxBuildHasher import. In separate_components, sum element counts to precompute estimated_capacity and initialize the indices map with that capacity. In components, compute estimated_capacity from gindices (excluding usize::MAX) and construct an FxHashMap with an explicit capacity and hasher instead of the default.
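The components-side change described in the walkthrough can be sketched roughly as follows. This is a minimal sketch, not the PR's actual code: std's HashMap/HashSet stand in for the Fx variants, and estimated_components, group_by_component, and the sample data are hypothetical names for illustration.

```rust
use std::collections::{HashMap, HashSet};

// Upper bound on the number of map keys: count entries that belong to
// some component. usize::MAX is the sentinel for "no component".
fn estimated_components(gindices: &[usize]) -> usize {
    gindices.iter().filter(|&&n| n != usize::MAX).count()
}

// Pre-size the map once, then group nodes by component index.
fn group_by_component(gindices: &[usize]) -> HashMap<usize, HashSet<usize>> {
    let mut gb = HashMap::with_capacity(estimated_components(gindices));
    for (node, &g) in gindices.iter().enumerate() {
        if g != usize::MAX {
            gb.entry(g).or_insert_with(HashSet::new).insert(node);
        }
    }
    gb
}

fn main() {
    let gindices = [0, 0, usize::MAX, 1];
    let gb = group_by_component(&gindices);
    println!("{} components", gb.len()); // two components: {0, 1} and {3}
}
```

Note the estimate counts assigned nodes, not distinct components, so it over-allocates when components hold many nodes; it is still a valid capacity hint because the map can never have more keys than assigned nodes.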

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰
I nibbled bytes and counted seeds,
Pre-sized baskets meet our needs.
Fx hashes, snug and keen,
Fewer hops, the graph runs clean.
A tiny rabbit's optimization scene!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name | Status | Explanation
Description check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled.
Title check | ✅ Passed | The PR title 'perf: pre-allocate hash maps' directly and clearly describes the main change: optimizing performance by pre-allocating hash maps in connected_components.rs.
Docstring coverage | ✅ Passed | Docstring coverage is 100.00%, above the required threshold of 80.00%.

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e76809c and 80556b3.

📒 Files selected for processing (1)
  • src/undirected/connected_components.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/undirected/connected_components.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: benchmarks


@codspeed-hq bot commented Jan 2, 2026

CodSpeed Performance Report

Merging #732 will improve performance by ×2

Comparing push-zuzvxtrytpyn (80556b3) with main (d5aceb5)

Summary

⚡ 1 improvement
✅ 37 untouched

Benchmarks breakdown

Benchmark | BASE | HEAD | Efficiency
separate_components | 2.2 ms | 1.1 ms | ×2

Copilot AI left a comment

Pull request overview

This PR introduces performance optimizations by pre-allocating hash maps with estimated capacities to reduce reallocations during insertions. The changes focus on the separate_components and components functions in the undirected connected components module.

  • Pre-allocates HashMap in separate_components based on total element count across all groups
  • Pre-allocates FxHashMap in components based on the count of non-empty components

Comment on lines +63 to +65
// Pre-size the hash map to reduce reallocations
let estimated_capacity = groups.iter().map(|g| g.into_iter().count()).sum();
let mut indices = HashMap::with_capacity(estimated_capacity);
Copilot AI commented Jan 2, 2026

The capacity estimation here iterates through all groups twice - once to count the elements for pre-allocation (line 64) and again in the main loop (line 66). The into_iter().count() call on line 64 will consume or clone each iterator, potentially causing performance overhead that negates the benefit of pre-allocation.

Consider storing the group sizes during the first pass or accepting that the pre-allocation might not be worth the extra iteration cost in all cases. Alternatively, if groups are cheap to iterate, document that this optimization assumes iteration is inexpensive.

Suggested change
-// Pre-size the hash map to reduce reallocations
-let estimated_capacity = groups.iter().map(|g| g.into_iter().count()).sum();
-let mut indices = HashMap::with_capacity(estimated_capacity);
+let mut indices = HashMap::new();
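The trade-off this comment raises hinges on how cheap the counting pass is. A minimal sketch of the pattern under one favorable assumption (estimated_capacity and index_nodes are hypothetical helpers, and each group is a Vec, so len() is O(1) and the extra pass costs almost nothing; with lazy iterators the count would be a full traversal):

```rust
use std::collections::HashMap;

// Counting pass: cheap when each group is a slice or Vec (len() is O(1)),
// but a full extra traversal when groups are lazy iterators.
fn estimated_capacity(groups: &[Vec<u32>]) -> usize {
    groups.iter().map(|g| g.len()).sum()
}

// Pre-size the map once, then fill it without intermediate rehashing.
fn index_nodes(groups: &[Vec<u32>]) -> HashMap<u32, usize> {
    let mut indices = HashMap::with_capacity(estimated_capacity(groups));
    for (i, group) in groups.iter().enumerate() {
        for &node in group {
            indices.insert(node, i);
        }
    }
    indices
}

fn main() {
    let groups = vec![vec![1, 2, 3], vec![4, 5], vec![6]];
    let indices = index_nodes(&groups);
    println!("{} nodes indexed", indices.len()); // 6 nodes across 3 groups
}
```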

// Pre-size the hash map to reduce reallocations
let estimated_capacity = gindices.iter().filter(|&&n| n != usize::MAX).count();
let mut gb: FxHashMap<usize, FxHashSet<N>> =
FxHashMap::with_capacity_and_hasher(estimated_capacity, FxBuildHasher);
Copilot AI commented Jan 2, 2026

The FxHashMap type is a type alias that already includes FxBuildHasher as its default hasher, so you can simply use FxHashMap::with_capacity(estimated_capacity) instead of the more verbose FxHashMap::with_capacity_and_hasher(estimated_capacity, FxBuildHasher). This follows the same pattern as the standard HashMap::with_capacity() used in the separate_components function above.

Suggested change
-FxHashMap::with_capacity_and_hasher(estimated_capacity, FxBuildHasher);
+FxHashMap::with_capacity(estimated_capacity);
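One caveat on this suggestion: std's inherent HashMap::with_capacity is defined only for the default RandomState hasher, so whether the shorter with_capacity form compiles depends on the exact FxHashMap alias and crate version in use; with_capacity_and_hasher is the form that always works for a custom hasher. A stdlib-only sketch of the pattern (FastMap and make_map are hypothetical names, and BuildHasherDefault<DefaultHasher> stands in for FxBuildHasher):

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

// Stand-in for an FxHashMap-style alias: a std HashMap parameterized
// over a zero-sized, Default-constructible build hasher.
type FastMap<K, V> = HashMap<K, V, BuildHasherDefault<DefaultHasher>>;

fn make_map(capacity: usize) -> FastMap<usize, u32> {
    // with_capacity_and_hasher pre-sizes the map while supplying the
    // non-default hasher explicitly; for a custom S this constructor
    // is always available, unlike the RandomState-only with_capacity.
    FastMap::with_capacity_and_hasher(capacity, BuildHasherDefault::default())
}

fn main() {
    let mut gb = make_map(8);
    gb.insert(0, 42);
    println!("{:?}", gb.get(&0)); // Some(42)
}
```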

@coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/undirected/connected_components.rs (1)

108-111: Simplify to use with_capacity instead of with_capacity_and_hasher.

Since FxHashMap is a type alias for HashMap<K, V, FxBuildHasher>, calling with_capacity_and_hasher with an explicit FxBuildHasher is redundant. The simpler with_capacity method already uses the correct hasher.

🔎 Proposed simplification
-        let mut gb: FxHashMap<usize, FxHashSet<N>> =
-            FxHashMap::with_capacity_and_hasher(estimated_capacity, FxBuildHasher);
+        let mut gb: FxHashMap<usize, FxHashSet<N>> =
+            FxHashMap::with_capacity(estimated_capacity);
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d5aceb5 and e76809c.

📒 Files selected for processing (1)
  • src/undirected/connected_components.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Test suite (msrv)
  • GitHub Check: Test suite (stable)
  • GitHub Check: Test suite (beta)
  • GitHub Check: Test suite (nightly)
  • GitHub Check: Extra tests in release mode
  • GitHub Check: Test with minimal versions
  • GitHub Check: Agent
  • GitHub Check: benchmarks
🔇 Additional comments (2)
src/undirected/connected_components.rs (2)

8-8: LGTM!

The FxBuildHasher import is necessary for the explicit hasher configuration on line 111.


63-65: Verify that pre-allocation benefits outweigh double iteration cost.

The optimization iterates through all groups twice: once to count elements (line 64) and again to process them (lines 66-79). While pre-allocation reduces reallocations, the upfront counting pass adds O(n) overhead.

Please confirm with benchmarks that this optimization improves performance for the typical use cases, especially when groups are small or sparse.

@samueltardieu samueltardieu added this pull request to the merge queue Jan 3, 2026
Merged via the queue into main with commit 857616f Jan 3, 2026
14 checks passed
@samueltardieu samueltardieu deleted the push-zuzvxtrytpyn branch January 3, 2026 01:10