⚡️ Speed up method AsyncCallInstrumenter.visit_AsyncFunctionDef by 123% in PR #769 (clean-async-branch)
#780
⚡️ This pull request contains optimizations for PR #769
If you approve this dependent PR, these changes will be merged into the original PR branch `clean-async-branch`.

📄 123% (1.23x) speedup for `AsyncCallInstrumenter.visit_AsyncFunctionDef` in `codeflash/code_utils/instrument_existing_tests.py`

⏱️ Runtime: 9.25 milliseconds → 4.14 milliseconds (best of 186 runs)

📝 Explanation and details
The optimized code achieves a 123% speedup by replacing expensive AST traversal operations with more efficient alternatives:
Key Optimizations:
- Decorator Search Optimization: Replaced the `any()` generator expression with a simple loop that breaks early when it finds `timeout_decorator.timeout`. This avoids the overhead of creating and resuming a generator. The loop stops at the first match, which helps most when the decorator appears early in a long decorator list.
- AST Traversal Replacement: The most significant optimization replaces `ast.walk(stmt)` with a manual stack-based depth-first search in `_optimized_instrument_statement()`. The original `ast.walk()` visits every node in the AST subtree through a generator backed by a deque, which adds per-node overhead and includes many irrelevant nodes. The optimized version relies on direct `_fields` attribute access and targets the `ast.Await` node that matches the criteria.

Performance Impact by Test Case:

The line profiler shows that the critical bottleneck was `_instrument_statement()`, which accounted for 96.4% of the original runtime. It still accounts for 93.3% after the change, but its absolute time is much lower, which demonstrates the effectiveness of the AST traversal optimization.

✅ Correctness verification report:
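As a rough illustration of the two optimizations described above, the sketch below checks that an early-exit decorator loop and a stack-based traversal give the same answers as an `ast.walk()`-based search on a small sample. This is not the code from the PR: the helper names `has_timeout_decorator`, `contains_await_walk`, and `contains_await_stack` are hypothetical, the sample test function is invented, and the matching criteria are simplified to "any `ast.Await` node", which is an assumption.

```python
import ast


def has_timeout_decorator(func: ast.AsyncFunctionDef) -> bool:
    """Plain loop that returns at the first `timeout_decorator.timeout` decorator."""
    for dec in func.decorator_list:
        # Accept both `@timeout_decorator.timeout` and `@timeout_decorator.timeout(...)`.
        target = dec.func if isinstance(dec, ast.Call) else dec
        if (
            isinstance(target, ast.Attribute)
            and target.attr == "timeout"
            and isinstance(target.value, ast.Name)
            and target.value.id == "timeout_decorator"
        ):
            return True
    return False


def contains_await_walk(stmt: ast.AST) -> bool:
    """Baseline: search the subtree with ast.walk()."""
    return any(isinstance(node, ast.Await) for node in ast.walk(stmt))


def contains_await_stack(stmt: ast.AST) -> bool:
    """Manual stack-based depth-first search that returns at the first ast.Await."""
    stack = [stmt]
    while stack:
        node = stack.pop()
        if isinstance(node, ast.Await):
            return True
        # Read children directly from the node's declared fields.
        for name in node._fields:
            value = getattr(node, name, None)
            if isinstance(value, ast.AST):
                stack.append(value)
            elif isinstance(value, list):
                stack.extend(v for v in value if isinstance(v, ast.AST))
    return False


# Hypothetical sample input; parsing does not import or execute it.
SOURCE = '''
import timeout_decorator

@timeout_decorator.timeout(5)
async def test_fetch():
    x = 1
    result = await fetch(x)
    assert result
'''

func = ast.parse(SOURCE).body[1]  # the async test function
flags_walk = [contains_await_walk(stmt) for stmt in func.body]
flags_stack = [contains_await_stack(stmt) for stmt in func.body]
```

Both traversals agree on whether a statement contains an `await`, although they visit nodes in a different order (`ast.walk()` is breadth-first, the stack version is depth-first), so a search that depends on which matching node is found first would need a closer comparison.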
🌀 Generated Regression Tests and Runtime
To edit these changes, `git checkout codeflash/optimize-pr769-2025-09-27T02.50.03` and push.