Skip to content

Conversation

misrasaurabh1
Copy link

📄 3,421% (34.21x) speedup for DbtAdapter.build_parent_map in recce/adapter/dbt_adapter/__init__.py

⏱️ Runtime : 41.5 milliseconds 1.18 milliseconds (best of 21 runs)

📝 Explanation and details

The optimized code achieves a dramatic 3420% speedup by addressing two critical performance bottlenecks:

Primary Optimization: Avoiding expensive to_dict() conversion

  • The original code calls manifest.to_dict() which consumes 98.4% of the total runtime (499ms out of 507ms)
  • The optimized version attempts to access manifest.parent_map directly, falling back to to_dict() only if needed
  • This eliminates the massive serialization overhead when the parent_map is already available as an attribute

Secondary Optimization: Converting to set for O(1) lookups

  • Changes node_ids = nodes.keys() to node_ids = set(nodes)
  • Set membership tests (if k not in node_ids) are O(1) instead of O(n) for dict_keys views
  • While this optimization shows smaller gains in the profiler, it provides consistent performance improvements for larger node sets

Performance Impact:

  • Total runtime drops from 41.5ms to 1.18ms
  • The manifest.to_dict() bottleneck is completely eliminated in the common case
  • Set-based membership checks are more efficient, especially as the number of nodes scales

Test Case Benefits:
The optimizations are particularly effective for:

  • Large dbt projects with many nodes (where set lookups show greater advantage)
  • Repeated calls to build_parent_map (avoiding repeated expensive dict conversions)
  • Any scenario where the manifest object already has a parent_map attribute available

The try/except ensures backward compatibility while maximizing performance gains in the common case.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 🔘 None Found
⏪ Replay Tests 34 Passed
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
⏪ Replay Tests and Runtime
Test File::Test Function Original ⏱️ Optimized ⏱️ Speedup
test_pytest_tests__replay_test_0.py::test_recce_adapter_dbt_adapter___init___DbtAdapter_build_parent_map 40.7ms 233μs 17297%✅

To edit these changes git checkout codeflash/optimize-DbtAdapter.build_parent_map-med9gwbs and push.

Codeflash

codeflash-ai bot and others added 2 commits August 15, 2025 20:08
The optimized code achieves a dramatic **3420% speedup** by addressing two critical performance bottlenecks:

**Primary Optimization: Avoiding expensive `to_dict()` conversion**
- The original code calls `manifest.to_dict()` which consumes 98.4% of the total runtime (499ms out of 507ms)
- The optimized version attempts to access `manifest.parent_map` directly, falling back to `to_dict()` only if needed
- This eliminates the massive serialization overhead when the parent_map is already available as an attribute

**Secondary Optimization: Converting to set for O(1) lookups**
- Changes `node_ids = nodes.keys()` to `node_ids = set(nodes)`
- Set membership tests (`if k not in node_ids`) are O(1) instead of O(n) for dict_keys views
- While this optimization shows smaller gains in the profiler, it provides consistent performance improvements for larger node sets

**Performance Impact:**
- Total runtime drops from 41.5ms to 1.18ms
- The `manifest.to_dict()` bottleneck is completely eliminated in the common case
- Set-based membership checks are more efficient, especially as the number of nodes scales

**Test Case Benefits:**
The optimizations are particularly effective for:
- Large dbt projects with many nodes (where set lookups show greater advantage)
- Repeated calls to `build_parent_map` (avoiding repeated expensive dict conversions)
- Any scenario where the manifest object already has a `parent_map` attribute available

The try/except ensures backward compatibility while maximizing performance gains in the common case.
Copy link

codecov bot commented Aug 19, 2025

Codecov Report

❌ Patch coverage is 66.66667% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
recce/adapter/dbt_adapter/__init__.py 66.66% 2 Missing ⚠️
Files with missing lines Coverage Δ
recce/adapter/dbt_adapter/__init__.py 74.09% <66.66%> (-0.12%) ⬇️

... and 9 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@wcchang1115 wcchang1115 self-requested a review August 21, 2025 04:53
Copy link
Collaborator

@wcchang1115 wcchang1115 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @misrasaurabh1,

Thanks for the new optimization PR!
Except for the minor comment, others are good to me.

Could you please also fix the flake8 format issue and include your sign-off in the commit to pass the DCO check?
Thanks!

parent_map_source = manifest.to_dict()["parent_map"]

node_ids = nodes.keys()
node_ids = set(nodes)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The keys view is set-like, so I think we don't need to change it to set to improve the lookup time.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair. making the edit

Copy link
Collaborator

@wcchang1115 wcchang1115 Aug 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I mean we can keep the original node_ids = nodes.keys()

@misrasaurabh1
Copy link
Author

should be ready to merge

@wcchang1115
Copy link
Collaborator

wcchang1115 commented Aug 22, 2025

Hi misrasaurabh1,
I would prefer to keep the node_ids = nodes.keys() as is for readability.
Could you please help edit it and fix the failed CI jobs? thanks!

And we need a sign-off in the commit messages to pass the DCO check.
You could provide your commit with git commit --signoff or git commit -s

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants