path normalization and tempdir fixes for windows #363
Conversation
PR Reviewer Guide 🔍 Here are some key observations to aid the review process:
PR Code Suggestions ✨ Explore these optional code suggestions:
…(`part-1-windows-fixes`) Here's a **runtime-optimized rewrite** based directly on the profile. The main high-cost issues are:

1. **Repeated module/function alias generation** (especially in two `function_imports` loops): avoid recomputing.
2. **Many `Path(file_path).as_posix()` calls**: precompute or cache, since `file_path` is not mutated.
3. **Constant string formatting and dedent/indent in loops**: minimize usage.
4. **Code structure**: avoid repeated lookups and temporary allocations in large inner loops.

Below is the **optimized code** (all results/behavior unchanged).

### Key improvements

- **All function/class/alias/file_path strings are now generated once up front** and referenced by mapping, not recomputed every iteration.
- **No `.split`/`.join` calls or `Path()` constructions in inner loops.**
- **No `textwrap.dedent`/`indent` in the inner loop; a fast string join with one indentation pass is used instead.**
- **Eliminated duplicate lookups in `functions_data`.**
- **Minimized unnecessary set/list and string allocations.**

This should yield a significant performance boost, especially at scale. Output/behavior is identical; semantic minimization confirmed.
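The `as_posix()` caching named above can be sketched as follows. This is an illustration of the pattern, not the project's actual code; `build_import_lines`, its arguments, and the generated strings are invented for the example.

```python
from pathlib import Path


def build_import_lines(file_paths, function_names):
    """Hypothetical sketch: precompute the as_posix() form of every file
    path once, then reference it from the inner loop instead of rebuilding
    a Path object per (file, function) pair."""
    posix_by_path = {fp: Path(fp).as_posix() for fp in file_paths}
    lines = []
    for fp in file_paths:
        posix = posix_by_path[fp]  # one dict lookup; no Path() in the loop
        for fn in function_names:
            lines.append(f"from {posix} import {fn}")
    return lines
```

The dictionary is built once per call, so the cost of `Path(...).as_posix()` is paid per unique path rather than per inner-loop iteration.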
⚡️ Codeflash found optimizations for this PR 📄 32% (0.32x) speedup for
…t-1-windows-fixes`) Here's a rewritten version of your program optimized for speed and minimal memory usage.

- Avoid list `append` in the loop; instead build the list with an iterative approach, then reverse at the end if needed.
- Direct string concatenation and caching reduce creation of `Path` objects.
- Explicit variable assignments reduce property accesses and speed up the while loop.

**What changed:**

- Avoided repeated property accesses by caching `parent`.
- Used string formatting (which benchmarks very well in 3.11+) to avoid unnecessary `Path` object creation and method calls in the loop.
- Otherwise, maintained the exact function signature and return values.
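The `parent`-caching idea above can be illustrated with a toy walk up a directory tree. This is not the project's function; `ancestor_names` and its behavior are invented for the sketch.

```python
from pathlib import Path


def ancestor_names(path: Path):
    """Illustrative sketch: collect each path component while walking up
    the tree. The `parent` property is read into a local once per step,
    and only plain strings are produced inside the loop, with no
    intermediate Path construction."""
    names = []
    current = path
    parent = current.parent  # cache the property access
    while parent != current:  # Path(".").parent == Path("."), so this terminates
        names.append(f"{current.name}")  # plain string, no Path(...) per step
        current, parent = parent, parent.parent
    return names
```

Because `parent` is held in a local, each loop iteration performs one property read instead of several, which is exactly the kind of micro-saving the comment describes for a hot `while` loop.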
⚡️ Codeflash found optimizations for this PR 📄 1,729% (17.29x) speedup for
…`part-1-windows-fixes`) Here is an optimized version of your program with reduced execution overhead, less memory allocation, and more efficient logic. All return values remain exactly the same, function signatures are unchanged, and comments are preserved as required.

**Key optimizations:**

- Convert repeated property accesses/calls to local variables where possible.
- Reuse the same decorator AST node object where safe (the AST is not mutated elsewhere).
- Pre-check presence of the target class and short-circuit early.
- In the loop, short-circuit when `__init__` is found to avoid unnecessary iterations.

**Summary of improvements:**

- Reused AST decorator components, minimizing repeated object creation.
- Early exit on target class and `__init__` findings.
- Fewer variable assignments and iterations.
- All core logic and comments are preserved.
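The reuse-the-decorator-node and early-exit patterns can be sketched like this. The class and decorator names are invented for illustration; this is not the repository's actual visitor.

```python
import ast


class InitInstrumenter(ast.NodeVisitor):
    """Hedged sketch of the pattern described above: build the decorator
    AST node once in __init__ and attach the same object to each matching
    class, breaking out of the loop as soon as __init__ is found."""

    def __init__(self, target_class: str):
        self.target_class = target_class
        # Constructed once, reused for every visit_ClassDef call.
        self.decorator = ast.Name(id="codeflash_capture", ctx=ast.Load())

    def visit_ClassDef(self, node: ast.ClassDef):
        if node.name != self.target_class:  # short-circuit non-targets early
            return
        for stmt in node.body:
            if isinstance(stmt, ast.FunctionDef) and stmt.name == "__init__":
                stmt.decorator_list.append(self.decorator)
                break  # stop once __init__ is found
```

Sharing one decorator node across classes is safe only while the tree is never mutated through that node, which is the caveat the comment itself raises.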
⚡️ Codeflash found optimizations for this PR 📄 91% (0.91x) speedup for
… (`part-1-windows-fixes`) Here's an optimized rewrite of **your original code**, focusing on critical hotspots from the profiler data.

**Optimization summary:**

- Inline the `node_in_call_position` logic directly into `find_and_update_line_node` to avoid repeated function-call overhead for every AST node, since the inner loop is extremely hot.
- Pre-split `self.call_positions` into an efficient lookup format if positions are reused often.
- Reduce redundant attribute access and method calls by caching frequently accessed values where possible.
- Move branching for the most frequent path (`ast.Name`) up, and short-circuit to avoid unnecessary checks.
- Fast path for the common case, `ast.Name`, skipping `.unparse` and unnecessary packing/mapping.
- Avoid repeatedly constructing `ast.Name(id="codeflash_loop_index", ctx=ast.Load())` by storing it as a field (`self.ast_codeflash_loop_index`, etc.), since it is needed many times for a single method walk.
- Stop walking after the first relevant call in the node; don't continue iterating once we've performed a replacement.

Below is the optimized code, with all comments and function signatures unmodified except where logic was changed.

**Key performance wins:**

- The hot inner loop now inlines the call-position check, caches common constants, and breaks early.
- Repeated AST node creation for names and constants is avoided; where possible they are reused or built up front.
- Redundant access to `self` fields or function attributes is limited, happening only at the top of `find_and_update_line_node`.
- The fast path (`ast.Name`) is handled first and breaks early, further reducing unnecessary work in the common case.

This will **substantially improve the speed** of the code when processing many test nodes with many function-call ASTs.
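The store-the-node-as-a-field trick for `codeflash_loop_index` can be shown with a minimal transformer. The rewriter class and the name being replaced are invented for the example; only the caching pattern mirrors the comment.

```python
import ast


class LoopIndexRewriter(ast.NodeTransformer):
    """Illustrative sketch: the ast.Name node for the loop index is built
    once in __init__ and the same object is returned for every match,
    instead of constructing a fresh node per replacement."""

    def __init__(self):
        # Reused instead of rebuilding ast.Name(...) on each visit.
        self.loop_index_name = ast.Name(id="codeflash_loop_index", ctx=ast.Load())

    def visit_Name(self, node: ast.Name):
        # Fast path: plain names are the most frequent node kind here.
        if node.id == "loop_index":
            return self.loop_index_name
        return node
```

For a read-only replacement like this, returning one shared node is fine; if downstream code mutated locations per occurrence, a `copy.copy` of the cached node would be needed instead.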
⚡️ Codeflash found optimizations for this PR 📄 24% (0.24x) speedup for
Can you run at least one e2e test and one unit test on Windows? Otherwise we will keep breaking this. I would also like a Windows compatibility doc, so that when the Windows test fails, the dev knows the gotchas for Windows and doesn't waste much time fixing it.
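One gotcha such a doc would likely cover is path comparison: the same file renders with backslashes on Windows and forward slashes elsewhere, so naive string comparison of paths breaks. A minimal, self-contained illustration (not project code) using `pathlib`'s pure-path classes, which behave the same on any host OS:

```python
from pathlib import PurePosixPath, PureWindowsPath

# A Windows-style path and a POSIX-style path naming the same file.
win = PureWindowsPath(r"src\codeflash\main.py")
posix = PurePosixPath("src/codeflash/main.py")

# Their raw string forms differ...
assert str(win) != str(posix)

# ...but normalizing to POSIX form makes comparisons stable on any OS.
assert win.as_posix() == str(posix)
```

Normalizing with `as_posix()` at comparison and serialization boundaries, as this PR does in places, is the usual way to keep such checks portable.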
CF-427
…part-1-windows-fixes`) Here is an optimized version of your program, rewritten to minimize unnecessary work, allocation, and redundant computation, addressing the main bottlenecks surfaced by your profiling data.

- **Tabulate**: The main performance issue is repeated function calls and list comprehensions inside loops. The column/row transforms, especially header formatting and alignment, are the heaviest. We reduce allocation, avoid repeated calls when not needed, and specialize the "headers" and "no headers" branches.
- **existing_tests_source_for**: Avoids unnecessary dict lookups and string formatting by grouping updates, and directly iterates over precomputed keys, minimizing set/dict operations.
- **General**: Inline tiny helpers, use local variables to reduce global lookups, and use tuple/list comprehensions where possible.

**Note**: All logic, side effects, return values, and signatures are **preserved exactly** per your requirements.

**Summary of main optimizations:**

- **No repeated list comprehensions** in tight loops, especially for column and header formatting.
- **Locals for small globals** (`MIN_PADDING`, `width_fn`, etc.), and cached path computation in `existing_tests_source_for`.
- **No repeated dict/set membership tests**; lookups minimized to once per unique key.
- **Fast header/row formatting** with minimal allocations and in-place width calculations.

You should observe a faster runtime and lower memory usage, especially on large tables or when invoked many times. All function behaviors and signatures are precisely preserved.
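The "locals for small globals" trick mentioned above can be shown in a toy table-padding function. The function and its behavior are invented for illustration; only the local-binding pattern reflects the comment.

```python
def pad_rows(rows, min_padding=2):
    """Toy sketch: bind the global `len` to a local alias so the tight
    comprehension avoids a repeated global lookup on every cell, then pad
    all cells to one shared width."""
    _len = len  # local alias: LOAD_FAST beats LOAD_GLOBAL in CPython
    width = max(_len(cell) for row in rows for cell in row) + min_padding
    return [[cell.ljust(width) for cell in row] for row in rows]
```

The win per lookup is tiny, but as the comment notes, it compounds in hot loops that touch every cell of a large table.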
⚡️ Codeflash found optimizations for this PR 📄 43% (0.43x) speedup for
478df65 to 02dc316
The optimized code achieves a **4182% speedup** by eliminating expensive Path object creation and manipulation within the loop.
**Key optimizations:**
1. **Pre-compute path parts**: Instead of repeatedly calling `current_path.parent` and creating new Path objects, the code uses `source_code_path.parts` to get all path components upfront as a tuple.
2. **Replace Path operations with string concatenation**: The original code's bottleneck was `(Path(current_path.name) / last_added).as_posix()` which created Path objects and converted them to POSIX format in every iteration. The optimized version uses simple f-string formatting: `f"{parts[i]}/{last_added}"`.
3. **Index-based iteration**: Rather than walking up the directory tree using `current_path.parent`, it uses a reverse range loop over the parts indices, which is much faster than Path navigation.
**Performance impact by test case type:**
- **Deeply nested paths** see the most dramatic improvements (up to 7573% faster for 1000-level nesting) because they eliminate the most Path object creations
- **Simple 1-2 level paths** still benefit significantly (200-400% faster) from avoiding even a few Path operations
- **Edge cases** with special characters or unicode maintain the same speedup ratios, showing the optimization is universally effective
The line profiler confirms the original bottleneck: 94.3% of time was spent on Path object creation (`candidate_path = (Path(current_path.name) / last_added).as_posix()`), which is now replaced with lightweight string operations.
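The three optimizations above can be sketched in one place. This is a hedged reconstruction of the shape the comment describes, not the repository's actual `generate_candidates`; the exact candidate order and contents are assumed.

```python
from pathlib import Path


def generate_candidates(source_code_path: Path):
    """Sketch: build progressively longer path suffixes, e.g.
    'c.py', 'b/c.py', 'a/b/c.py'. The parts tuple is computed once
    up front, the walk uses a reverse index range instead of chained
    .parent calls, and each candidate is an f-string join rather than
    a (Path(...) / ...).as_posix() construction."""
    parts = source_code_path.parts           # all components, computed once
    last_added = parts[-1]                   # start from the file name
    candidates = [last_added]
    for i in range(len(parts) - 2, -1, -1):  # index walk, no Path navigation
        last_added = f"{parts[i]}/{last_added}"
        candidates.append(last_added)
    return candidates
```

Each iteration does one string format and one list append, which is why eliminating the per-step `Path` construction dominates the reported speedup on deeply nested paths.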
⚡️ Codeflash found optimizations for this PR 📄 4,183% (41.83x) speedup for
…25-09-29T22.03.01 ⚡️ Speed up function `generate_candidates` by 4,183% in PR #363 (`part-1-windows-fixes`)
This PR is now faster! 🚀 @KRRT7 accepted my optimizations from:
ceb17ce to 824d8f6
The optimized code achieves a **65% speedup** through strategic precomputation of AST nodes that are repeatedly created during class processing.

**Key optimizations:**

1. **Precomputed AST components in `__init__`**: Instead of reconstructing identical AST nodes (like `ast.Name`, `ast.arg`, `ast.Constant`) on every `visit_ClassDef` call, the optimized version creates them once during initialization and reuses them. This eliminates the expensive AST node construction overhead seen in the profiler: lines creating decorator keywords and super() call components dropped from ~2ms total to ~0.6ms.
2. **Optimized decorator presence check**: Replaced the `any()` generator expression with a `for/else` loop that stops immediately when finding an existing `codeflash_capture` decorator. This avoids generator allocation overhead and short-circuits the search earlier.
3. **Reduced per-class AST construction**: The decorator is now built once per class using precomputed components, rather than reconstructing all keywords and function references from scratch each time.

**Performance impact by test type:**

- **Basic cases** (single class with simple `__init__`): ~140-220% faster, benefiting from reduced AST node construction
- **Edge cases** (classes needing synthetic `__init__`): ~100-150% faster, particularly benefiting from prebuilt super() call components
- **Large scale** (many methods/classes): ~17-40% faster, where the constant-time optimizations compound across many iterations

The optimization is most effective for workloads processing many classes, as the upfront precomputation cost is amortized across multiple `visit_ClassDef` calls, directly addressing the bottleneck of repetitive AST node creation identified in the profiler.
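The precomputation and `for/else` presence check can be sketched together. This is an assumed shape, not the real `InitDecorator`; the keyword argument name and constructor parameter are invented.

```python
import ast


class InitDecorator(ast.NodeVisitor):
    """Sketch of the pattern described above: heavyweight AST fragments
    are built once in __init__, and the decorator-presence check uses
    for/else so the scan stops at the first existing decorator."""

    def __init__(self, tmp_dir: str):
        # Built once, reused across every visit_ClassDef call.
        self.decorator_name = ast.Name(id="codeflash_capture", ctx=ast.Load())
        self.tmp_dir_kw = ast.keyword(arg="tmp_dir_path", value=ast.Constant(value=tmp_dir))

    def visit_ClassDef(self, node: ast.ClassDef):
        for fn in node.body:
            if not isinstance(fn, ast.FunctionDef) or fn.name != "__init__":
                continue
            for dec in fn.decorator_list:
                if isinstance(dec, ast.Call) and getattr(dec.func, "id", None) == "codeflash_capture":
                    break  # already instrumented; stop scanning immediately
            else:
                # No existing decorator found: attach one from prebuilt parts.
                fn.decorator_list.append(
                    ast.Call(func=self.decorator_name, args=[], keywords=[self.tmp_dir_kw])
                )
            break  # only one __init__ per class; stop iterating the body
```

The `for/else` makes the visitor idempotent: running it twice over the same tree adds the decorator only once, while the prebuilt `ast.Name`/`ast.keyword` objects avoid per-class reconstruction.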
⚡️ Codeflash found optimizations for this PR 📄 65% (0.65x) speedup for
…25-09-30T01.44.19 ⚡️ Speed up method `InitDecorator.visit_ClassDef` by 65% in PR #363 (`part-1-windows-fixes`)
This PR is now faster! 🚀 @KRRT7 accepted my optimizations from:
normalize paths for windows where necessary