Behavior Test Instrumentation to account for input mutation #867
base: main
Conversation
PR Reviewer Guide 🔍 (review updated until commit ffff5f1) — here are some key observations to aid the review process:
PR Code Suggestions ✨ — latest suggestions up to commit ffff5f1
Previous suggestions — up to commit b91b61f
Codeflash Bot seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. Have you already signed the CLA but the status is still pending? Let us recheck it.
Persistent review updated to latest commit ffff5f1
The optimized code achieves a **22% speedup** through two main optimizations that reduce overhead in AST traversal and attribute lookups:

**1. Custom AST traversal replaces `ast.walk()`**

The original code uses `ast.walk()`, which yields every node in the tree through a generic generator. The optimized version implements `iter_ast_calls()`, a manual iterative traversal driven by a single explicit stack that yields only `ast.Call` nodes, cutting per-node generator overhead and skipping the non-call nodes the caller would otherwise have to filter out.

**2. Reduced attribute lookups in hot paths**

- In `node_in_call_position()`: caches node attributes (`node_lineno`, `node_end_lineno`, etc.) with `getattr()` defaults instead of repeated `hasattr()` checks followed by attribute access
- In `find_and_update_line_node()`: hoists frequently accessed object attributes (`fn_obj.qualified_name`, `self.mode`, etc.) into local variables before the loop
- Pre-creates reusable AST nodes (`codeflash_loop_index`, `codeflash_cur`, `codeflash_con`) instead of recreating them on each iteration

**Performance characteristics:**

- **Small AST trees** (basic function calls): 5-28% faster due to reduced attribute lookups
- **Large AST trees** (deeply nested calls): 18-26% faster due to the more efficient traversal that avoids `ast.walk()`
- **Large call-position lists**: 26% faster due to optimized position checking with cached attributes

The optimizations are most effective for complex test-instrumentation scenarios with large AST trees or many call positions to check, which is typical in code analysis and transformation workflows.
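A minimal sketch of the two patterns described above, assuming `iter_ast_calls` and `node_in_call_position` look roughly like this; the `(lineno, end_lineno)` tuple shape for call positions is an assumption for illustration, not the PR's exact signature:

```python
import ast

def iter_ast_calls(tree: ast.AST):
    """Yield only ast.Call nodes, using one explicit stack instead of ast.walk()."""
    stack = [tree]
    pop = stack.pop
    extend = stack.extend
    while stack:
        node = pop()
        if isinstance(node, ast.Call):
            yield node
        extend(ast.iter_child_nodes(node))

def node_in_call_position(node: ast.AST, call_positions) -> bool:
    """Cache position attributes once via getattr defaults instead of
    repeated hasattr() checks plus attribute reads."""
    node_lineno = getattr(node, "lineno", None)
    node_end_lineno = getattr(node, "end_lineno", None)
    if node_lineno is None or node_end_lineno is None:
        return False
    # call_positions assumed to be an iterable of (lineno, end_lineno) pairs here
    return any(start <= node_lineno and node_end_lineno <= end
               for start, end in call_positions)

tree = ast.parse("print(len('abc'))")
calls = list(iter_ast_calls(tree))
print(len(calls))                                 # 2: print(...) and len(...)
print(node_in_call_position(calls[0], [(1, 1)]))  # True
```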
⚡️ Codeflash found optimizations for this PR 📄 22% (0.22x) speedup for `InjectPerfOnly.find_and_update_line_node`
…25-11-01T00.02.02 ⚡️ Speed up method `InjectPerfOnly.find_and_update_line_node` by 22% in PR #867 (`inspect-signature-issue`)
This PR is now faster! 🚀 @misrasaurabh1 accepted my optimizations from:
Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
This PR is now faster! 🚀 Aseem Saxena accepted my code suggestion above.
The optimized code achieves a **17% speedup** through several targeted micro-optimizations that reduce attribute lookups and method-resolution overhead in the AST traversal hot path.

**Key optimizations:**

1. **Cached attribute lookups in `__init__`**: The construction loop now caches method references (`add_dot_methods = self._dot_methods.setdefault`) to avoid repeated attribute resolution during the preprocessing phase.
2. **Single `getattr` call with a `None` fallback**: Repeated `isinstance(node_value, ast.Name)` checks and `node_value.id` accesses are replaced with a single `val_id = getattr(node_value, "id", None)` call, eliminating redundant type checking and attribute lookups.
3. **Direct base-class method calls**: `self.generic_visit(node)` becomes `ast.NodeVisitor.generic_visit(self, node)`, bypassing Python's method resolution and attribute lookup on `self` for a faster direct invocation.
4. **Restructured control flow**: The imported-modules check and the function-name lookup are combined into a single conditional branch, reducing the number of separate `isinstance` calls compared with the original nested structure.

**Performance impact:**

- The line profiler shows the most expensive line (`self.generic_visit(node)`) dropped from 9.86 ms to 8.80 ms (a 10.8% improvement)
- The `generic_visit` call itself became 40% faster (5.04 ms → 2.99 ms) due to the direct base-class calls
- Test results show consistent 8-17% improvements across scenarios, with the largest gains (up to 23.6%) in complex cases involving multiple lookups

**Best use cases:** the optimization is most effective for:

- Large ASTs with many attribute nodes (as shown in the large-scale tests)
- Codebases with extensive import analysis where `visit_Attribute` is called frequently
- Scenarios with many non-matching attributes, where the fast-path optimizations provide the most benefit

The changes preserve all original functionality while eliminating Python overhead in this performance-critical AST traversal code.
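A hedged sketch of the three micro-optimizations listed above, in a stripped-down visitor; the class, its attributes, and the `dotted_targets` input are simplified stand-ins for `ImportAnalyzer`, not the PR's actual code:

```python
import ast

class AttributeVisitor(ast.NodeVisitor):
    """Simplified stand-in showing the lookup-caching patterns described above."""

    def __init__(self, dotted_targets):
        # dotted_targets: e.g. {("np", "mean"), ("os", "path")}  (illustrative)
        self._dot_methods = {}
        add_dot_method = self._dot_methods.setdefault  # cache the bound method once
        for module, attr in dotted_targets:
            add_dot_method(module, set()).add(attr)
        self.found = []

    def visit_Attribute(self, node: ast.Attribute) -> None:
        # Single getattr with a None fallback replaces isinstance + .id access
        val_id = getattr(node.value, "id", None)
        if val_id is not None:
            attrs = self._dot_methods.get(val_id)
            if attrs is not None and node.attr in attrs:
                self.found.append(f"{val_id}.{node.attr}")
        # Direct base-class call skips attribute resolution on self
        ast.NodeVisitor.generic_visit(self, node)

visitor = AttributeVisitor({("np", "mean")})
visitor.visit(ast.parse("x = np.mean(values)"))
print(visitor.found)  # ['np.mean']
```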
⚡️ Codeflash found optimizations for this PR 📄 17% (0.17x) speedup for `ImportAnalyzer.visit_Attribute`
…25-11-05T08.18.39 ⚡️ Speed up method `ImportAnalyzer.visit_Attribute` by 17% in PR #867 (`inspect-signature-issue`)
This PR is now faster! 🚀 @aseembits93 accepted my optimizations from:
The optimization replaces the standard `ast.NodeVisitor.generic_visit` call with a custom `_fast_generic_visit` method that inlines the AST traversal logic, eliminating method-resolution overhead and adding more aggressive early-exit checks.

**Key performance improvements:**

1. **Eliminated method-resolution overhead**: The original code called `ast.NodeVisitor.generic_visit(self, node)`, which incurs method lookup and dispatch costs. The optimized version inlines this logic directly, avoiding the base-class method call entirely.
2. **More frequent early-exit checks**: The new `_fast_generic_visit` checks `self.found_any_target_function` at multiple points during traversal (before processing lists and individual AST nodes), allowing faster short-circuiting once a target function is found.
3. **Optimized attribute access**: Direct `getattr` calls and cached method lookups (`getattr(self, 'visit_' + item.__class__.__name__, None)`) reduce repeated attribute resolution.

**Performance impact by test case:**

- **Large-scale tests** show the biggest gains (27-35% faster) because they process many AST nodes, where the traversal overhead compounds
- **Basic tests** with fewer nodes show moderate improvements (9-20% faster)
- **Edge cases** with complex nesting benefit from the more frequent early-exit checks

The line profiler shows time spent in `generic_visit` dropping from 144.2 ms to 107.9 ms (a 25% improvement), with the overall `visit_Call` method improving from 287.5 ms to 210.3 ms. This optimization is particularly valuable for AST analysis tools that process large codebases, as the traversal-overhead reduction scales with the size and complexity of the analyzed code.
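A sketch of the inlined traversal with early-exit checks, assuming a `found_any_target_function` flag like the one described above; the class and the target-matching logic are illustrative, not the exact PR implementation:

```python
import ast

class CallAnalyzer(ast.NodeVisitor):
    """Simplified stand-in for the visitor described above."""

    def __init__(self, target_names):
        self.target_names = set(target_names)
        self.found_any_target_function = False

    def visit_Call(self, node: ast.Call) -> None:
        name = getattr(node.func, "id", None) or getattr(node.func, "attr", None)
        if name in self.target_names:
            self.found_any_target_function = True
            return
        self._fast_generic_visit(node)

    def _fast_generic_visit(self, node: ast.AST) -> None:
        # Inlined generic_visit: bail out as soon as a target is found, and
        # resolve visit_* methods per child instead of routing through visit().
        if self.found_any_target_function:
            return
        for _, value in ast.iter_fields(node):
            if self.found_any_target_function:
                return
            if isinstance(value, list):
                for item in value:
                    if self.found_any_target_function:
                        return
                    if isinstance(item, ast.AST):
                        meth = getattr(self, "visit_" + item.__class__.__name__, None)
                        (meth or self._fast_generic_visit)(item)
            elif isinstance(value, ast.AST):
                meth = getattr(self, "visit_" + value.__class__.__name__, None)
                (meth or self._fast_generic_visit)(value)

analyzer = CallAnalyzer({"target_fn"})
analyzer._fast_generic_visit(ast.parse("def f():\n    return target_fn(1)"))
print(analyzer.found_any_target_function)  # True
```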
⚡️ Codeflash found optimizations for this PR 📄 31% (0.31x) speedup for `ImportAnalyzer.visit_Call`
…25-11-05T08.40.37 ⚡️ Speed up method `ImportAnalyzer.visit_Call` by 31% in PR #867 (`inspect-signature-issue`)
This PR is now faster! 🚀 @aseembits93 accepted my optimizations from:
⚡️ Codeflash found optimizations for this PR 📄 3,438% (34.38x) speedup for
The optimization converts the recursive AST traversal from a call-stack based approach to an iterative one using a manual stack, delivering a 44% performance improvement.
**Key optimizations applied:**
1. **Stack-based iteration replaces recursion**: The original code used recursive calls to `_fast_generic_visit()` and `meth()` for AST traversal. The optimized version uses a manual stack with `while`-loop iteration, eliminating function-call overhead and stack-frame management costs (see the sketch after this summary).
2. **Faster method resolution**: Replaced `getattr(self, "visit_" + classname, None)` with `type(self).__dict__.get("visit_" + classname)`, which is significantly faster for method lookup. The class dictionary lookup avoids the more expensive attribute resolution pathway.
3. **Local variable caching**: Pre-cached frequently accessed attributes like `stack.append`, `stack.pop`, and `type(self).__dict__` into local variables to reduce repeated attribute lookups during the tight inner loop.
**Why this leads to speedup:**
- **Reduced function call overhead**: Each recursive call in the original version creates a new stack frame with associated setup/teardown costs. The iterative approach eliminates this entirely.
- **Faster method resolution**: Dictionary `.get()` is ~2-3x faster than `getattr()` for method lookups, especially important since this happens for every AST node visited.
- **Better cache locality**: The manual stack keeps traversal state in a more compact, cache-friendly format compared to Python's call stack.
**Performance characteristics from test results:**
The optimization shows variable performance depending on AST structure:
- **Large nested trees**: 39.2% faster (deep recursion → iteration benefit is maximized)
- **Early exit scenarios**: 57% faster on large trees (stack-based approach handles early termination more efficiently)
- **Simple nodes**: Some overhead for very small cases due to setup costs, but still performs well on realistic workloads
- **Complex traversals**: 14-24% faster on typical code structures with mixed node types
This optimization is particularly valuable for AST analysis tools that process large codebases, where the cumulative effect of faster traversal becomes significant.
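A minimal sketch of the iterative rewrite under the assumptions above (class-dict method lookup, cached stack operations, early exit). The class is a simplified stand-in, and unlike `NodeVisitor` semantics, the loop here still pushes a node's children after calling its `visit_*` method:

```python
import ast

class StackVisitor(ast.NodeVisitor):
    """Simplified stand-in demonstrating the iterative traversal described above."""

    def __init__(self, target_names):
        self.target_names = set(target_names)
        self.found_any_target_function = False

    def visit_Call(self, node: ast.Call) -> None:
        name = getattr(node.func, "id", None) or getattr(node.func, "attr", None)
        if name in self.target_names:
            self.found_any_target_function = True

    def _fast_generic_visit(self, node: ast.AST) -> None:
        # Manual stack replaces recursion; hot names are cached as locals.
        stack = [node]
        push = stack.append
        pop = stack.pop
        cls_dict = type(self).__dict__
        while stack:
            if self.found_any_target_function:
                return
            current = pop()
            # Class-dict lookup is cheaper than getattr-based method resolution
            meth = cls_dict.get("visit_" + current.__class__.__name__)
            if meth is not None:
                meth(self, current)  # plain function from the class dict, so pass self
                if self.found_any_target_function:
                    return
            for child in ast.iter_child_nodes(current):
                push(child)

visitor = StackVisitor({"target_fn"})
visitor._fast_generic_visit(ast.parse("value = helper(target_fn(1))"))
print(visitor.found_any_target_function)  # True
```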
⚡️ Codeflash found optimizations for this PR 📄 44% (0.44x) speedup for `ImportAnalyzer._fast_generic_visit`
…25-11-05T09.44.48 ⚡️ Speed up method `ImportAnalyzer._fast_generic_visit` by 44% in PR #867 (`inspect-signature-issue`)
This PR is now faster! 🚀 @aseembits93 accepted my optimizations from:
PR Type
Enhancement, Tests
Description
- Bind call args via `inspect.signature` (sketched below)
- Apply defaults to bound arguments
- Route args/kwargs through the wrapper in behavior mode
- Add `inspect` import for instrumentation
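A minimal illustration of the `inspect.signature` binding pattern the description refers to; the function, its parameters, and the call-site arguments are hypothetical, not taken from the diff:

```python
import inspect

def target(a, b=2, *args, **kwargs):
    # Hypothetical function under test
    return a + b

sig = inspect.signature(target)
bound = sig.bind(1, c=3)     # bind the call-site args against the target's signature
bound.apply_defaults()       # fill in defaults such as b=2
print(bound.args, bound.kwargs)             # (1, 2) {'c': 3}
# The instrumented wrapper can then be invoked with the normalized arguments:
print(target(*bound.args, **bound.kwargs))  # 3
```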
Diagram Walkthrough
File Walkthrough
instrument_existing_tests.py — Bind call args and pass via wrapper
codeflash/code_utils/instrument_existing_tests.py
Touches `FunctionCallNode` / `Arguments` and the `inspect` import for the new binding logic.