-
Notifications
You must be signed in to change notification settings - Fork 6
perf(query-planner): optimize InDegree with Vec-based storage #624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Replaces HashMap with a boxed slice using in_degree+1 encoding, enabling calloc-style zero-initialization. Converts panics to Result-based error handling. Also avoids String allocation in BestPathTracker by using borrowed &str references.
Summary of ChangesHello @rhzs, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces several performance enhancements and error handling improvements within the query planner. The Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces several significant performance and robustness improvements. The switch from HashMap to a Box<[u32]> in InDegree is a great optimization that reduces allocation and hashing overhead, aligning well with the performance-first ethos. The conversion of panics to Result-based error handling greatly improves the robustness of the query planner. Additionally, avoiding String allocation in BestPathTracker by using &str is another good performance gain.
The code is well-written and the changes are solid. I have one suggestion in optimize_plan_sequence to improve readability by avoiding variable shadowing and using unreachable! for logically unreachable code paths, which would make the intent clearer.
| ([.., PlanNode::Condition(last)], PlanNode::Condition(curr)) | ||
| if are_conditions_compatible(last, curr) => | ||
| { | ||
| let last_node = acc.pop().unwrap(); | ||
| let merged_node = merge_two_condition_nodes(last_node, current_node); | ||
| acc.push(merged_node); | ||
| // Pop the last element - we know it exists and is a Condition from the pattern | ||
| let Some(PlanNode::Condition(last)) = acc.pop() else { | ||
| // This branch is logically unreachable due to the slice pattern above | ||
| acc.push(current_node); | ||
| return acc; | ||
| }; | ||
| let PlanNode::Condition(curr) = current_node else { | ||
| // This branch is logically unreachable due to the match above | ||
| return acc; | ||
| }; | ||
| acc.push(merge_two_condition_nodes(last, curr)); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this implementation is correct, it can be made more readable and idiomatic.
- The variable names
lastandcurrare shadowed inside the block, which can be slightly confusing. Renaming them to e.g.,last_ref/curr_refin the pattern andlast_owned/current_ownedinside would improve clarity, adhering to the spirit of Rule H in the style guide regarding clear naming. - The
elseblocks for thelet-elsestatements contain logic that is commented as unreachable. It's more idiomatic to useunreachable!()in these cases to make it clear to the compiler and other developers that these paths should never be taken. This also makes the code cleaner.
Here is a suggested improvement:
([.., PlanNode::Condition(last_ref)], PlanNode::Condition(current_ref))
if are_conditions_compatible(last_ref, current_ref) =>
{
// Pop the last element - we know it exists and is a Condition from the pattern
let Some(PlanNode::Condition(last_owned)) = acc.pop() else {
// This branch is logically unreachable due to the slice pattern above
unreachable!("The slice pattern guarantees the last element is a ConditionNode.");
};
let PlanNode::Condition(current_owned) = current_node else {
// This branch is logically unreachable due to the match above
unreachable!("The match pattern guarantees the current node is a ConditionNode.");
};
acc.push(merge_two_condition_nodes(last_owned, current_owned));
}References
- The use of variable shadowing for
lastandcurrcan make the code slightly harder to follow. Using distinct names for the borrowed references in the pattern and the owned values inside the block improves clarity, in line with the principle of naming that reveals intent (Rule H). (link)
dotansimha
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your contribution @rhzs ! Can you please share the background for this PR? Did you encounter some performance issues?
Replaces HashMap with a boxed slice using in_degree+1 encoding, enabling calloc-style zero-initialization.
While this PR seems appealing, I need your help with a few things to clear on this.
I do understand the benefit of using zero allocations, but I'm not sure if the benefit is so significant in that area of the code, especially because it adds some additional complexity.
Can you please elaborate on how you tested the performance improvement? It will help me to understand if the added complexity is worth it.
Converts panics to Result-based error handling.
✅
Avoids String allocation in BestPathTracker by using borrowed &str references.
✅
|
Hi @rhzs , as part of house keeping, I'm going to close this one. Please let me know if you have answers to my previous questions, and then we can reopen. |
HashMapwith a boxed slice usingin_degree+1encoding, enabling calloc-style zero-initialization.Stringallocation inBestPathTrackerby using borrowed&strreferences.