⚡️ Speed up method CommentMapper.visit_AsyncFunctionDef by 11% in PR #687 (granular-async-instrumentation)
The optimized code achieves an 11% speedup through several micro-optimizations that reduce Python interpreter overhead in the hot paths:
**1. Cached Attribute/Dictionary Lookups**
The most impactful change is caching frequently accessed attributes and dictionaries as local variables:
- `context_stack = self.context_stack`
- `results = self.results`
- `original_runtimes = self.original_runtimes`
- `optimized_runtimes = self.optimized_runtimes`
- `get_comment = self.get_comment`
Binding these to locals eliminates repeated `self.` attribute lookups in the tight loops, which the profiler shows are executed thousands of times (2,825+ iterations); a sketch of the pattern follows below.
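The shape of this change is sketched below, assuming the attribute names listed above; the class internals, key format, and `get_comment` signature are hypothetical stand-ins, since the real method body is not reproduced in this summary.

```python
import ast

class CommentMapper(ast.NodeVisitor):
    """Sketch only: fields and key format are assumptions, not the PR's actual code."""

    def __init__(self, original_runtimes: dict, optimized_runtimes: dict):
        self.results: dict[int, str] = {}
        self.original_runtimes = original_runtimes
        self.optimized_runtimes = optimized_runtimes

    def get_comment(self, orig_ns: int, opt_ns: int) -> str:
        return f"{orig_ns}ns -> {opt_ns}ns"

    def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef) -> None:
        # Hoist attribute lookups into locals once: LOAD_FAST inside the loop
        # is cheaper than a LOAD_ATTR on `self` per iteration.
        results = self.results
        original_runtimes = self.original_runtimes
        optimized_runtimes = self.optimized_runtimes
        get_comment = self.get_comment
        node_body = node.body  # also pre-cache the loop's iterable (see section 2)

        for i, stmt in enumerate(node_body):
            key = str(i)  # hypothetical key format; the real one is not shown here
            if key in original_runtimes and key in optimized_runtimes:
                results[stmt.lineno] = get_comment(
                    original_runtimes[key], optimized_runtimes[key]
                )
        self.generic_visit(node)
```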
**2. Pre-cached Loop Bodies**
Binding `node_body = node.body` and `ln_body = line_node.body` before the loops reduces attribute-access overhead; the profiler shows these attributes are read inside nested loops with high hit counts.
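A minimal sketch of the nested-loop shape this targets; only `node.body` and `line_node.body` come from the PR, the helper name and surrounding logic are hypothetical.

```python
import ast

def map_nested_statements(node: ast.AsyncFunctionDef) -> list[tuple[int, int]]:
    """Sketch: index top-level statements and the statements nested inside them."""
    pairs: list[tuple[int, int]] = []
    node_body = node.body  # one attribute read, reused for the whole walk
    for i, line_node in enumerate(node_body):
        ln_body = getattr(line_node, "body", None)  # bound once, used twice below
        if ln_body is None:
            continue  # simple statement, nothing nested to process
        for j, _nested in enumerate(ln_body):
            pairs.append((i, j))
    return pairs
```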
**3. Optimized String Operations**
Using f-strings (`f"{test_qualified_name}#{self.abs_path}"`, `f"{i}_{j}"`) instead of concatenation with the `+` operator reduces temporary string objects and formatting overhead.
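An illustrative before/after of the key construction; the helper functions and parameters are hypothetical, only the two key formats come from the PR.

```python
# Before: `+` concatenation builds intermediate strings for each piece.
def make_keys_concat(test_qualified_name: str, abs_path: str, i: int, j: int) -> tuple[str, str]:
    run_key = test_qualified_name + "#" + abs_path
    position_key = str(i) + "_" + str(j)
    return run_key, position_key

# After: each key is formatted in a single f-string pass.
def make_keys_fstring(test_qualified_name: str, abs_path: str, i: int, j: int) -> tuple[str, str]:
    run_key = f"{test_qualified_name}#{abs_path}"
    position_key = f"{i}_{j}"
    return run_key, position_key
```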
**4. Refined getattr Usage**
Switching from `getattr(compound_line_node, "body", [])` to `getattr(compound_line_node, 'body', None)` plus a conditional check avoids allocating a fresh empty list whenever a statement has no body.
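A sketch of the revised pattern, wrapped in a hypothetical helper for illustration:

```python
import ast

def collect_nested_linenos(statements: list[ast.stmt]) -> list[int]:
    """Sketch: gather line numbers of statements nested inside compound statements."""
    linenos: list[int] = []
    for compound_line_node in statements:
        # Defaulting to None instead of [] avoids allocating a throwaway empty
        # list for every simple (non-compound) statement in the loop.
        body = getattr(compound_line_node, "body", None)
        if body is not None:
            for stmt in body:
                linenos.append(stmt.lineno)
    return linenos
```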
**Performance Impact by Test Type:**
- **Large-scale tests** show the biggest gains (14-117% faster) due to the cumulative effect of micro-optimizations in loops
- **Compound statement tests** benefit significantly (16-45% faster) from reduced attribute lookups in nested processing
- **Simple cases** show modest improvements (1-6% faster) as overhead reduction is less pronounced
- **Edge cases** with no matching runtimes benefit from faster loop traversal (3-12% faster)
The optimizations are most effective for functions with many statements or nested compound structures, where the tight loops amplify the benefit of reduced Python interpreter overhead.
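A quick, hypothetical micro-benchmark (not from the PR) illustrates how the cost of repeated `self.` lookups grows with loop size and why caching a local pays off most in large-scale cases:

```python
import timeit

class Holder:
    def __init__(self) -> None:
        self.runtimes = {str(i): i for i in range(1000)}

    def lookup_attr(self) -> int:
        # Attribute lookup on `self` repeated on every iteration.
        total = 0
        for i in range(1000):
            total += self.runtimes[str(i)]
        return total

    def lookup_local(self) -> int:
        # Same work with the dict hoisted into a local once.
        runtimes = self.runtimes
        total = 0
        for i in range(1000):
            total += runtimes[str(i)]
        return total

h = Holder()
print("attr:  ", timeit.timeit(h.lookup_attr, number=2000))
print("local: ", timeit.timeit(h.lookup_local, number=2000))
```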