perf(query-planner): optimize InDegree with Vec-based storage #624

rhzs · 2025-12-16T01:16:19Z

Replaces HashMap with a boxed slice using in_degree+1 encoding, enabling calloc-style zero-initialization.
Converts panics to Result-based error handling.
Avoids String allocation in BestPathTracker by using borrowed &str references.
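
For readers unfamiliar with the encoding, here is a minimal sketch of the idea, not the PR's actual code. It assumes nodes are identified by a dense `usize` index, that a slot value of `0` means "never registered", and that a slot value of `k + 1` means "`k` incoming edges still pending". The `InDegreeError` type, the `register` method, and all method signatures are hypothetical and exist only for illustration.

```rust
/// Hypothetical error type; the PR's real error type is not shown in this thread.
#[derive(Debug, PartialEq)]
enum InDegreeError {
    UnknownNode(usize),
    AlreadyFulfilled(usize),
    Overflow(usize),
}

/// Assumed layout: slot 0 = "never registered", slot k + 1 = "k edges pending".
struct InDegree {
    slots: Box<[u32]>,
}

impl InDegree {
    fn new(node_count: usize) -> Self {
        // An all-zero vector can typically be served by a zeroed allocation
        // (calloc-style), so every slot starts in the "never registered" state.
        Self { slots: vec![0u32; node_count].into_boxed_slice() }
    }

    /// Stores `in_degree + 1` so that a real in-degree of 0 stays distinguishable
    /// from an untouched slot.
    fn register(&mut self, node: usize, in_degree: u32) -> Result<(), InDegreeError> {
        let encoded = in_degree.checked_add(1).ok_or(InDegreeError::Overflow(node))?;
        let slot = self.slots.get_mut(node).ok_or(InDegreeError::UnknownNode(node))?;
        *slot = encoded;
        Ok(())
    }

    /// Consumes one pending edge, returning an error instead of panicking.
    fn mark_as_processed(&mut self, node: usize) -> Result<(), InDegreeError> {
        let slot = self.slots.get_mut(node).ok_or(InDegreeError::UnknownNode(node))?;
        match *slot {
            0 => Err(InDegreeError::UnknownNode(node)),
            1 => Err(InDegreeError::AlreadyFulfilled(node)),
            _ => {
                *slot -= 1;
                Ok(())
            }
        }
    }

    /// A node is fulfilled once its decoded in-degree reaches zero (encoded value 1).
    fn is_fulfilled(&self, node: usize) -> Result<bool, InDegreeError> {
        match self.slots.get(node).copied() {
            None | Some(0) => Err(InDegreeError::UnknownNode(node)),
            Some(encoded) => Ok(encoded == 1),
        }
    }
}
```

The `+ 1` offset is what lets a zeroed buffer double as the "absent" state, which a `HashMap` would otherwise express through a missing key.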


gemini-code-assist · 2025-12-16T01:16:35Z

Summary of Changes

Hello @rhzs, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces several performance enhancements and error handling improvements within the query planner. The InDegree calculation mechanism has been refactored to use a more efficient, vector-backed storage that benefits from optimized zero-initialization. Concurrently, error management in InDegree operations has been upgraded from panics to Result types, enhancing the system's resilience. Additionally, the BestPathTracker has been optimized to reduce memory allocations by leveraging borrowed string slices.

Highlights

Performance Optimization (InDegree): Replaced the HashMap storage in InDegree with a Box<[u32]> (a vector-based approach). This change utilizes an in_degree + 1 encoding, allowing for calloc-style zero-initialization which can be more memory efficient.
Improved Error Handling: Converted panic-based error handling in InDegree methods (mark_as_processed, is_fulfilled) to Result-based error handling, making the code more robust and allowing callers to gracefully handle potential issues.
Performance Optimization (BestPathTracker): Eliminated String allocations in BestPathTracker by switching from BTreeMap<String, ...> to BTreeMap<&'graph str, ...>, using borrowed string references instead of owned String objects.
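
The borrowed-key change in the last highlight could look roughly like the sketch below. Only the `BestPathTracker` name and the `BTreeMap<&'graph str, ...>` key type come from the summary above. The `Graph` struct, the `u64` cost value, and the `offer`/`best` methods are assumptions made for illustration, since the real value type is elided.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

/// Hypothetical owner of the strings; the real graph type is not shown here.
struct Graph {
    subgraph_names: Vec<String>,
}

/// Keys borrow from the graph, so inserting a key never allocates a new String.
/// The `u64` cost is an assumed stand-in for the elided value type.
struct BestPathTracker<'graph> {
    best_cost_by_subgraph: BTreeMap<&'graph str, u64>,
}

impl<'graph> BestPathTracker<'graph> {
    fn new() -> Self {
        Self { best_cost_by_subgraph: BTreeMap::new() }
    }

    /// Records `cost` if it beats the current best; returns true when the map changed.
    fn offer(&mut self, subgraph: &'graph str, cost: u64) -> bool {
        match self.best_cost_by_subgraph.entry(subgraph) {
            Entry::Occupied(mut entry) => {
                if cost < *entry.get() {
                    entry.insert(cost);
                    true
                } else {
                    false
                }
            }
            Entry::Vacant(entry) => {
                entry.insert(cost);
                true
            }
        }
    }

    /// Lookups accept any `&str`, not only strings borrowed from the graph.
    fn best(&self, subgraph: &str) -> Option<u64> {
        self.best_cost_by_subgraph.get(subgraph).copied()
    }
}
```

The trade-off is that the tracker is now tied to the `'graph` lifetime and cannot outlive the data it borrows from.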

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces several significant performance and robustness improvements. The switch from HashMap to a Box<[u32]> in InDegree is a great optimization that reduces allocation and hashing overhead, aligning well with the performance-first ethos. The conversion of panics to Result-based error handling greatly improves the robustness of the query planner. Additionally, avoiding String allocation in BestPathTracker by using &str is another good performance gain.

The code is well-written and the changes are solid. I have one suggestion in optimize_plan_sequence to improve readability by avoiding variable shadowing and using unreachable! for logically unreachable code paths, which would make the intent clearer.

gemini-code-assist · 2025-12-16T01:18:14Z

lib/query-planner/src/planner/query_plan.rs

+            ([.., PlanNode::Condition(last)], PlanNode::Condition(curr))
+                if are_conditions_compatible(last, curr) =>
            {
-                let last_node = acc.pop().unwrap();
-                let merged_node = merge_two_condition_nodes(last_node, current_node);
-                acc.push(merged_node);
+                // Pop the last element - we know it exists and is a Condition from the pattern
+                let Some(PlanNode::Condition(last)) = acc.pop() else {
+                    // This branch is logically unreachable due to the slice pattern above
+                    acc.push(current_node);
+                    return acc;
+                };
+                let PlanNode::Condition(curr) = current_node else {
+                    // This branch is logically unreachable due to the match above
+                    return acc;
+                };
+                acc.push(merge_two_condition_nodes(last, curr));
            }


While this implementation is correct, it can be made more readable and idiomatic.

The variable names last and curr are shadowed inside the block, which can be slightly confusing. Renaming them to, for example, last_ref/current_ref in the pattern and last_owned/current_owned inside the block would improve clarity, adhering to the spirit of Rule H in the style guide regarding clear naming.

The else blocks for the let-else statements contain logic that is commented as unreachable. It's more idiomatic to use unreachable!() in these cases to make it clear to the compiler and other developers that these paths should never be taken. This also makes the code cleaner.

Here is a suggested improvement:

            ([.., PlanNode::Condition(last_ref)], PlanNode::Condition(current_ref))
                if are_conditions_compatible(last_ref, current_ref) =>
            {
                // Pop the last element - we know it exists and is a Condition from the pattern
                let Some(PlanNode::Condition(last_owned)) = acc.pop() else {
                    // This branch is logically unreachable due to the slice pattern above
                    unreachable!("The slice pattern guarantees the last element is a ConditionNode.");
                };
                let PlanNode::Condition(current_owned) = current_node else {
                    // This branch is logically unreachable due to the match above
                    unreachable!("The match pattern guarantees the current node is a ConditionNode.");
                };
                acc.push(merge_two_condition_nodes(last_owned, current_owned));
            }

References

The use of variable shadowing for last and curr can make the code slightly harder to follow. Using distinct names for the borrowed references in the pattern and the owned values inside the block improves clarity, in line with the principle of naming that reveals intent (Rule H).

dotansimha

Thank you for your contribution, @rhzs! Can you please share the background for this PR? Did you encounter any performance issues?

Replaces HashMap with a boxed slice using in_degree+1 encoding, enabling calloc-style zero-initialization.

This PR seems appealing, but I need your help clarifying a few things.
I do understand the benefit of using zero allocations, but I'm not sure if the benefit is so significant in that area of the code, especially because it adds some additional complexity.

Can you please elaborate on how you tested the performance improvement? It will help me to understand if the added complexity is worth it.

Converts panics to Result-based error handling.

✅

Avoids String allocation in BestPathTracker by using borrowed &str references.

✅

dotansimha · 2026-01-18T15:21:15Z

Hi @rhzs, as part of housekeeping, I'm going to close this one. Please let me know if you have answers to my previous questions, and then we can reopen.
Thanks!

gemini-code-assist bot reviewed Dec 16, 2025

refactor(query-plan): clarify condition merge arm

dbabaa8

ardatan requested a review from kamilkisiela December 16, 2025 01:46

fmt(*): fix format

f13f1d4

Urigo requested a review from dotansimha January 5, 2026 16:13

dotansimha reviewed Jan 8, 2026

dotansimha closed this Jan 18, 2026
