
Conversation

@misrasaurabh1 (Contributor) commented Jun 6, 2025

User description

This change creates PRs like #293, where the generated tests show all the timings.


PR Type

Enhancement, Tests


Description

  • Extract test removal utility to separate module

  • Integrate runtime comments into generated tests

  • Refactor optimizer to call removal and comment functions

  • Update test import path for removal utility


Changes walkthrough 📝

Relevant files

Enhancement

codeflash/code_utils/edit_generated_tests.py: Extract test removal utility
  • Add remove_functions_from_generated_tests utility
  • Compile regex to strip specified test functions
  • Return filtered GeneratedTestsList
  • [link]

codeflash/optimization/function_optimizer.py: Integrate runtime comments into optimizer
  • Update import for remove_functions_from_generated_tests
  • Remove old removal call location
  • Insert calls to removal and runtime comment methods
  • Add add_runtime_comments_to_generated_tests method
  • +162/-7

Tests

tests/test_remove_functions_from_generated_tests.py: Update test import path
  • Update import path to edit_generated_tests
  • Remove outdated blank line
  • +1/-2

Need help?
  • Type /help how to ... in the comments thread for any questions about PR-Agent usage.
  • Check out the documentation for more information.

github-actions bot commented Jun 6, 2025

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Missing Import

    The add_runtime_comments_to_generated_tests method uses cst (libcst) but there is no import libcst as cst in this file, causing a NameError at runtime.

    def add_runtime_comments_to_generated_tests(
        self,
        generated_tests: GeneratedTestsList,
        original_test_results: TestResults,
        optimized_test_results: TestResults,
    ) -> GeneratedTestsList:
        """Add runtime performance comments to function calls in generated tests."""
    
        def format_time(nanoseconds: int) -> str:
            """Format nanoseconds into a human-readable string with 3 significant digits when needed."""
    
            def count_significant_digits(num: int) -> int:
                """Count significant digits in an integer."""
                return len(str(abs(num)))
    
            def format_with_precision(value: float, unit: str) -> str:
                """Format a value with 3 significant digits precision."""
                if value >= 100:
                    return f"{value:.0f}{unit}"
                if value >= 10:
                    return f"{value:.1f}{unit}"
                return f"{value:.2f}{unit}"
    
            if nanoseconds < 1_000:
                return f"{nanoseconds}ns"
            if nanoseconds < 1_000_000:
                # Convert to microseconds
                microseconds_int = nanoseconds // 1_000
                if count_significant_digits(microseconds_int) >= 3:
                    return f"{microseconds_int}μs"
                microseconds_float = nanoseconds / 1_000
                return format_with_precision(microseconds_float, "μs")
            if nanoseconds < 1_000_000_000:
                # Convert to milliseconds
                milliseconds_int = nanoseconds // 1_000_000
                if count_significant_digits(milliseconds_int) >= 3:
                    return f"{milliseconds_int}ms"
                milliseconds_float = nanoseconds / 1_000_000
                return format_with_precision(milliseconds_float, "ms")
            # Convert to seconds
            seconds_int = nanoseconds // 1_000_000_000
            if count_significant_digits(seconds_int) >= 3:
                return f"{seconds_int}s"
            seconds_float = nanoseconds / 1_000_000_000
            return format_with_precision(seconds_float, "s")
    
        # Create dictionaries for fast lookup of runtime data
        original_runtime_by_test = original_test_results.usable_runtime_data_by_test_case()
        optimized_runtime_by_test = optimized_test_results.usable_runtime_data_by_test_case()
    
        class RuntimeCommentTransformer(cst.CSTTransformer):
            def __init__(self):
                self.in_test_function = False
                self.current_test_name = None
    
            def visit_FunctionDef(self, node: cst.FunctionDef) -> None:
                if node.name.value.startswith("test_"):
                    self.in_test_function = True
                    self.current_test_name = node.name.value
                else:
                    self.in_test_function = False
                    self.current_test_name = None
    
            def leave_FunctionDef(
                self, original_node: cst.FunctionDef, updated_node: cst.FunctionDef
            ) -> cst.FunctionDef:
                if original_node.name.value.startswith("test_"):
                    self.in_test_function = False
                    self.current_test_name = None
                return updated_node
    
            def leave_SimpleStatementLine(
                self, original_node: cst.SimpleStatementLine, updated_node: cst.SimpleStatementLine
            ) -> cst.SimpleStatementLine:
                if not self.in_test_function or not self.current_test_name:
                    return updated_node
    
                # Look for assignment statements that assign to codeflash_output
                # Handle both single statements and multiple statements on one line
                codeflash_assignment_found = False
                for stmt in updated_node.body:
                    if isinstance(stmt, cst.Assign):
                        if (
                            len(stmt.targets) == 1
                            and isinstance(stmt.targets[0].target, cst.Name)
                            and stmt.targets[0].target.value == "codeflash_output"
                        ):
                            codeflash_assignment_found = True
                            break
    
                if codeflash_assignment_found:
                    # Find matching test cases by looking for this test function name in the test results
                    matching_original_times = []
                    matching_optimized_times = []
    
                    for invocation_id, runtimes in original_runtime_by_test.items():
                        if invocation_id.test_function_name == self.current_test_name:
                            matching_original_times.extend(runtimes)
    
                    for invocation_id, runtimes in optimized_runtime_by_test.items():
                        if invocation_id.test_function_name == self.current_test_name:
                            matching_optimized_times.extend(runtimes)
    
                    if matching_original_times and matching_optimized_times:
                        original_time = min(matching_original_times)
                        optimized_time = min(matching_optimized_times)
    
                        # Create the runtime comment
                        comment_text = f"# {format_time(original_time)} -> {format_time(optimized_time)}"
    
                        # Add comment to the trailing whitespace
                        new_trailing_whitespace = cst.TrailingWhitespace(
                            whitespace=cst.SimpleWhitespace(" "),
                            comment=cst.Comment(comment_text),
                            newline=updated_node.trailing_whitespace.newline,
                        )
    
                        return updated_node.with_changes(trailing_whitespace=new_trailing_whitespace)
    
                return updated_node
    
        # Process each generated test
        modified_tests = []
        for test in generated_tests.generated_tests:
            try:
                # Parse the test source code
                tree = cst.parse_module(test.generated_original_test_source)
    
                # Transform the tree to add runtime comments
                transformer = RuntimeCommentTransformer()
                modified_tree = tree.visit(transformer)
    
                # Convert back to source code
                modified_source = modified_tree.code
    
                # Create a new GeneratedTests object with the modified source
                modified_test = GeneratedTests(
                    generated_original_test_source=modified_source,
                    instrumented_behavior_test_source=test.instrumented_behavior_test_source,
                    instrumented_perf_test_source=test.instrumented_perf_test_source,
                    behavior_file_path=test.behavior_file_path,
                    perf_file_path=test.perf_file_path,
                )
                modified_tests.append(modified_test)
            except Exception as e:
                # If parsing fails, keep the original test
                logger.debug(f"Failed to add runtime comments to test: {e}")
                modified_tests.append(test)
    
        return GeneratedTestsList(generated_tests=modified_tests)
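As a quick sanity check on the formatting logic quoted above, the helper can be collapsed into a single standalone function and exercised at the unit cutoffs; this sketch reproduces the same comparisons and format strings as the PR's nested helpers:

```python
def format_time(nanoseconds: int) -> str:
    """Compact re-statement of the PR's helper: unit-scaled, ~3 significant digits."""
    def fmt(value: float, unit: str) -> str:
        if value >= 100:
            return f"{value:.0f}{unit}"
        if value >= 10:
            return f"{value:.1f}{unit}"
        return f"{value:.2f}{unit}"

    if nanoseconds < 1_000:
        return f"{nanoseconds}ns"
    if nanoseconds < 1_000_000:
        whole = nanoseconds // 1_000
        return f"{whole}μs" if whole >= 100 else fmt(nanoseconds / 1_000, "μs")
    if nanoseconds < 1_000_000_000:
        whole = nanoseconds // 1_000_000
        return f"{whole}ms" if whole >= 100 else fmt(nanoseconds / 1_000_000, "ms")
    whole = nanoseconds // 1_000_000_000
    return f"{whole}s" if whole >= 100 else fmt(nanoseconds / 1_000_000_000, "s")

print(format_time(500))            # 500ns
print(format_time(1_500))          # 1.50μs
print(format_time(123_456))        # 123μs
print(format_time(25_000_000))     # 25.0ms
print(format_time(2_500_000_000))  # 2.50s
```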
    Missing Tests

    The new runtime comment feature isn't covered by any unit tests; add tests to verify that comments are correctly injected into generated tests.

        )
        cleanup_paths(paths_to_cleanup)
        if hasattr(get_run_tmp_file, "tmpdir"):
            get_run_tmp_file.tmpdir.cleanup()
    
    def add_runtime_comments_to_generated_tests(
        self,
        generated_tests: GeneratedTestsList,
        original_test_results: TestResults,
        optimized_test_results: TestResults,
    ) -> GeneratedTestsList:
        ...  # full body identical to the method quoted under "Missing Import" above
    Regex Robustness

    The regex in remove_functions_from_generated_tests may overreach or skip decorated/parametrized functions; validate against edge cases and refine the pattern.

    function_pattern = re.compile(
        rf"(@pytest\.mark\.parametrize\(.*?\)\s*)?def\s+{re.escape(test_function)}\(.*?\):.*?(?=\ndef\s|$)",
        re.DOTALL,
    )
    
    match = function_pattern.search(generated_test.generated_original_test_source)
    
    if match is None or "@pytest.mark.parametrize" in match.group(0):

    github-actions bot commented Jun 6, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Category: Possible issue
    Suggestion: Enable multiline regex matching

    Add the MULTILINE flag to the regex compilation so that the lookahead (?=\ndef\s|$)
    can reliably match function boundaries across lines. This ensures functions are
    removed correctly even when definitions span multiple lines or include indentation.

    codeflash/code_utils/edit_generated_tests.py [12-15]

     function_pattern = re.compile(
    -    rf"(@pytest\.mark\.parametrize\(.*?\)\s*)?def\s+{re.escape(test_function)}\(.*?\):.*?(?=\ndef\s|$)",
    -    re.DOTALL,
    +    rf"(@pytest\.mark\.parametrize\(.*?\)\s*)?def\s+{re.escape(test_function)}\(.*?\):.*?(?=^def\s|\Z)",
    +    re.DOTALL | re.MULTILINE,
     )
    Suggestion importance [1-10]: 6

    Why: Adding re.MULTILINE improves the regex boundary detection across lines and aligns with the intended lookahead semantics without major functional changes.

    Impact: Low
    Category: General
    Suggestion: Collapse redundant blank lines

    Collapse any leftover empty lines after removing a function to avoid leaving large gaps in the test source. A simple post-substitution clean-up can ensure the generated tests remain well-formatted.

    codeflash/code_utils/edit_generated_tests.py [22-24]

    -generated_test.generated_original_test_source = function_pattern.sub(
    -    "", generated_test.generated_original_test_source
    -)
    +cleaned = function_pattern.sub("", generated_test.generated_original_test_source)
    +# remove extra blank lines
    +generated_test.generated_original_test_source = re.sub(r"\n{3,}", "\n\n", cleaned)
    Suggestion importance [1-10]: 5

    Why: Cleaning up extra blank lines after removal maintains test formatting and readability, though it's a stylistic improvement with moderate impact.

    Impact: Low

    @misrasaurabh1 misrasaurabh1 changed the title Add the test case timing to the generated test Add the test case timing to the generated test [CF-482] Jun 6, 2025
    codeflash-ai bot added a commit that referenced this pull request Jun 6, 2025
    …(`add-timing-info-to-generated-tests`)
    
    Here's an optimized and faster version of your program. The main performance inefficiencies in the original code are:
    
    - **Repeated attribute accesses with `dict.get()` inside loops:** Pre-collecting values boosts efficiency.
    - **Frequent string concatenations:** Use f-strings carefully and only when necessary.
    - **Unnecessary use of `sorted` on a set each run.** Build this directly from the data.
    - **Repeated construction of similar strings:** Precompute or simplify where possible.
    - **Using `textwrap.indent` in a loop:** Combine with minimal copies.
    - **No need for `textwrap.dedent` if formatting is already explicit.**
    
    Below is the refactored code following these optimizations.
    
    
    
    **Summary of the changes:**
    - **Single pass for collecting imports and function names.**
    - **Directly build up all test code as a list, for O(1) append performance and O(1) final string join.**
    - **Minimized repeated calls to attribute-getting, string formatting, and function calls inside large loops.**
    - **Efficient, manual indentation instead of `textwrap.indent`.**
    - **Templates are constants, dedented only once.**
    - **All constants precomputed outside the loop.**
    
    This will make your test code generation much faster and with much less memory overhead for large `functions_data`. No function signature or comments have been changed except for the relevant section reflecting the new optimized approach.
    codeflash-ai bot commented Jun 6, 2025

    ⚡️ Codeflash found optimizations for this PR

    📄 50% (0.50x) speedup for create_trace_replay_test_code in codeflash/benchmarking/replay_test.py

    ⏱️ Runtime: 11.5 milliseconds → 7.65 milliseconds (best of 348 runs)

    I created a new dependent PR with the suggested changes. Please review:

    If you approve, it will be merged into this PR (branch add-timing-info-to-generated-tests).

    codeflash-ai bot added a commit that referenced this pull request Jun 6, 2025
    …o-to-generated-tests`)
    
    Certainly! Here is an optimized version of your program, keeping the function signature and return identical, but reducing the runtime by:
    
    - **Inlining helpers**: Removes function call overhead for `count_significant_digits`/`format_with_precision`.
    - **Avoiding `str(abs(num))`**: Uses math for the significant-digit check (`if num >= 100`), since the code only needs to know whether the value has at least 3 digits.
    - **Avoiding unnecessary assignments**: Avoids repeated string formatting expressions.
    - **Reduced code nesting**: Reduces the block levels for slightly improved branch prediction.
    - **No use of internal lists or expensive function calls in the critical path.**
    
    This should reduce the overall call count, remove slow string-conversion logic, and eliminate small value checks that are now obviously constant-time.
    
    
    
    ### Key improvements
    - No inner functions or per-call string allocations for digit counting.
    - Direct math comparison for integer digit checks.
    - Calls to string formatting are done only if needed with minimized branching.
    
    This version will be measurably faster per your line statistics, especially on the now inlined "digit count" and "format" logic.
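The claim that the string round-trip can be dropped rests on a simple equivalence: for the non-negative integers produced by the floor divisions here, len(str(abs(n))) >= 3 holds exactly when n >= 100. A quick brute-force check:

```python
# Verify the equivalence the bot's optimization relies on.
def digits_at_least_3(n: int) -> bool:
    return len(str(abs(n))) >= 3

assert all(digits_at_least_3(n) == (n >= 100) for n in range(10_000))
print("equivalence holds for 0..9999")
```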
    codeflash-ai bot added a commit that referenced this pull request Jun 6, 2025
     (`add-timing-info-to-generated-tests`)
    
    Here’s an optimized version of your program. The main bottleneck is not in the construction of `target_functions` (which is a tiny fraction of the runtime), but in the way you handle parsing and transformation with `libcst`.  
    However, gathering `target_functions` can be optimized using a list comprehension with tuple unpacking, while avoiding multiple attribute lookups.
    
    Also, the main time is spent in `module.visit(transformer)` and `cst.parse_module`. If you have control over how the transformer (i.e., `AddDecoratorTransformer`) is written, you should make it as restrictive and fast as possible, using `visit_`/`leave_` functions that early-exit on non-target nodes.  
    Below, I’ll only optimize what’s asked–rewriting this function to minimize unnecessary slow steps and any wasted computations, while preserving the code logic and interface.
    
    
    
    ### Changes
    - Combined the logic of extracting `(class_name, function_name)` into a set comprehension for fewer attribute accesses and tighter bytecode.
    - Added a check: if `target_functions` is empty, we just return the original code immediately (this prevents any parsing/visiting if there's nothing to decorate).
    - Comments left untouched unless relevant code was modified.
    - Retained function signature and interface.
    - Provided some minor micro-optimizations (generator expressions, less branching).
    - Eliminated redundant variable assignments.
    
    ### Notes
    - As per your profile, most time is spent in parsing and visiting. Optimize the `AddDecoratorTransformer` for early exits and to do as little as possible, in its own definition (not here), for further improvement.
    - Return early for the empty case: saves any parse/visit calls at all.
    - If you do ever get code objects, you could use `isinstance(module, cst.Module)` to skip reparsing, but as per the signature we always expect `str`.
    
    If you want even more speed, next steps are:  
    - Minimize the work that `AddDecoratorTransformer` does (match by qualified name to avoid visiting subtrees needlessly).
    - Use native AST parsing/writing if you have full control over decorator syntax constraints.
    
    Let me know if you want to see transformer optimizations as well!
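The early-exit structure the bot recommends can be sketched without libcst; this dependency-free analogue uses the stdlib ast module instead (names like codeflash_trace and add_decorator_to_code mirror the discussion but are illustrative, and ast does not preserve formatting the way libcst does):

```python
import ast

def add_decorator_to_code(code: str, target_functions: set[str]) -> str:
    """Add an illustrative @codeflash_trace decorator to the named functions.

    Stdlib-ast sketch of the early-exit idea: if there is nothing to decorate,
    skip parsing and visiting entirely and return the source unchanged.
    """
    if not target_functions:      # fast path: no parse/visit at all
        return code
    tree = ast.parse(code)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name in target_functions:
            node.decorator_list.insert(0, ast.Name(id="codeflash_trace", ctx=ast.Load()))
    return ast.unparse(tree)

out = add_decorator_to_code("def f():\n    return 1\n\ndef g():\n    return 2\n", {"f"})
print(out)
```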
    codeflash-ai bot commented Jun 6, 2025

    ⚡️ Codeflash found optimizations for this PR

    📄 17% (0.17x) speedup for add_codeflash_decorator_to_code in codeflash/benchmarking/instrument_codeflash_trace.py

    ⏱️ Runtime: 596 milliseconds → 510 milliseconds (best of 18 runs)

    I created a new dependent PR with the suggested changes. Please review:

    If you approve, it will be merged into this PR (branch add-timing-info-to-generated-tests).

    Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
    codeflash-ai bot commented Jun 6, 2025

    This PR is now faster! 🚀 Saurabh Misra accepted my code suggestion above.

    @misrasaurabh1 (Contributor, Author) commented:

    codeflash just sped up vibe code claude code :)

    @misrasaurabh1 misrasaurabh1 merged commit 9be93b6 into main Jun 6, 2025
    15 of 16 checks passed

    3 participants