Skip to content

Commit d305de8

Browse files
Optimize generate_candidates
The optimized code achieves a **4182% speedup** by eliminating expensive Path object creation and manipulation within the loop. **Key optimizations:** 1. **Pre-compute path parts**: Instead of repeatedly calling `current_path.parent` and creating new Path objects, the code uses `source_code_path.parts` to get all path components upfront as a tuple. 2. **Replace Path operations with string concatenation**: The original code's bottleneck was `(Path(current_path.name) / last_added).as_posix()` which created Path objects and converted them to POSIX format in every iteration. The optimized version uses simple f-string formatting: `f"{parts[i]}/{last_added}"`. 3. **Index-based iteration**: Rather than walking up the directory tree using `current_path.parent`, it uses a reverse range loop over the parts indices, which is much faster than Path navigation. **Performance impact by test case type:** - **Deeply nested paths** see the most dramatic improvements (up to 7573% faster for 1000-level nesting) because they eliminate the most Path object creations - **Simple 1-2 level paths** still benefit significantly (200-400% faster) from avoiding even a few Path operations - **Edge cases** with special characters or unicode maintain the same speedup ratios, showing the optimization is universally effective The line profiler confirms the original bottleneck: 94.3% of time was spent on Path object creation (`candidate_path = (Path(current_path.name) / last_added).as_posix()`), which is now replaced with lightweight string operations.
1 parent f978a40 commit d305de8

File tree

1 file changed

+15
-7
lines changed

1 file changed

+15
-7
lines changed

codeflash/code_utils/coverage_utils.py

Lines changed: 15 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -44,16 +44,24 @@ def build_fully_qualified_name(function_name: str, code_context: CodeOptimizatio
4444
def generate_candidates(source_code_path: Path) -> set[str]:
4545
"""Generate all the possible candidates for coverage data based on the source code path."""
4646
candidates = set()
47-
candidates.add(source_code_path.name)
48-
current_path = source_code_path.parent
49-
50-
last_added = source_code_path.name
51-
while current_path != current_path.parent:
52-
candidate_path = (Path(current_path.name) / last_added).as_posix()
47+
# Add the filename as a candidate
48+
name = source_code_path.name
49+
candidates.add(name)
50+
51+
# Precompute parts for efficient candidate path construction
52+
parts = source_code_path.parts
53+
n = len(parts)
54+
55+
# Walk up the directory structure without creating Path objects or repeatedly converting to posix
56+
last_added = name
57+
# Start from the last parent and move up to the root, exclusive (skip the root itself)
58+
for i in range(n - 2, 0, -1):
59+
# Combine the ith part with the accumulated path (last_added)
60+
candidate_path = f"{parts[i]}/{last_added}"
5361
candidates.add(candidate_path)
5462
last_added = candidate_path
55-
current_path = current_path.parent
5663

64+
# Add the absolute posix path as a candidate
5765
candidates.add(source_code_path.as_posix())
5866
return candidates
5967

0 commit comments

Comments
 (0)