-
Notifications
You must be signed in to change notification settings - Fork 22
⚡️ Speed up method DbtAdapter.build_parent_map
by 3,421%
#861
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
⚡️ Speed up method DbtAdapter.build_parent_map
by 3,421%
#861
Conversation
The optimized code achieves a dramatic **3420% speedup** by addressing two critical performance bottlenecks: **Primary Optimization: Avoiding expensive `to_dict()` conversion** - The original code calls `manifest.to_dict()` which consumes 98.4% of the total runtime (499ms out of 507ms) - The optimized version attempts to access `manifest.parent_map` directly, falling back to `to_dict()` only if needed - This eliminates the massive serialization overhead when the parent_map is already available as an attribute **Secondary Optimization: Converting to set for O(1) lookups** - Changes `node_ids = nodes.keys()` to `node_ids = set(nodes)` - Set membership tests (`if k not in node_ids`) are O(1) instead of O(n) for dict_keys views - While this optimization shows smaller gains in the profiler, it provides consistent performance improvements for larger node sets **Performance Impact:** - Total runtime drops from 41.5ms to 1.18ms - The `manifest.to_dict()` bottleneck is completely eliminated in the common case - Set-based membership checks are more efficient, especially as the number of nodes scales **Test Case Benefits:** The optimizations are particularly effective for: - Large dbt projects with many nodes (where set lookups show greater advantage) - Repeated calls to `build_parent_map` (avoiding repeated expensive dict conversions) - Any scenario where the manifest object already has a `parent_map` attribute available The try/except ensures backward compatibility while maximizing performance gains in the common case.
Codecov Report❌ Patch coverage is
... and 9 files with indirect coverage changes 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @misrasaurabh1,
Thanks for the new optimization PR!
Except for the minor comment, others are good to me.
Could you please also fix the flake8 format issue and include your sign-off in the commit to pass the DCO check?
Thanks!
parent_map_source = manifest.to_dict()["parent_map"] | ||
|
||
node_ids = nodes.keys() | ||
node_ids = set(nodes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The keys view is set-like, so I think we don't need to change it to set
to improve the lookup time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair. making the edit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I mean we can keep the original node_ids = nodes.keys()
should be ready to merge |
Hi misrasaurabh1, And we need a sign-off in the commit messages to pass the DCO check. Thanks! |
📄 3,421% (34.21x) speedup for
DbtAdapter.build_parent_map
inrecce/adapter/dbt_adapter/__init__.py
⏱️ Runtime :
41.5 milliseconds
→1.18 milliseconds
(best of21
runs)📝 Explanation and details
The optimized code achieves a dramatic 3420% speedup by addressing two critical performance bottlenecks:
Primary Optimization: Avoiding expensive
to_dict()
conversionmanifest.to_dict()
which consumes 98.4% of the total runtime (499ms out of 507ms)manifest.parent_map
directly, falling back toto_dict()
only if neededSecondary Optimization: Converting to set for O(1) lookups
node_ids = nodes.keys()
tonode_ids = set(nodes)
if k not in node_ids
) are O(1) instead of O(n) for dict_keys viewsPerformance Impact:
manifest.to_dict()
bottleneck is completely eliminated in the common caseTest Case Benefits:
The optimizations are particularly effective for:
build_parent_map
(avoiding repeated expensive dict conversions)parent_map
attribute availableThe try/except ensures backward compatibility while maximizing performance gains in the common case.
✅ Correctness verification report:
⏪ Replay Tests and Runtime
test_pytest_tests__replay_test_0.py::test_recce_adapter_dbt_adapter___init___DbtAdapter_build_parent_map
To edit these changes
git checkout codeflash/optimize-DbtAdapter.build_parent_map-med9gwbs
and push.