Benchmark Fixture fixes #586

KRRT7 · 2025-07-25T21:13:59Z

User description

Update plugin.py

PR Type

Enhancement, Tests, Bug fix, Other

Description

Add codeflash-benchmark plugin packaging and versioning
Enhance pytest hooks and benchmark fixture fallback logic
Generate unique replay test names and update test expectations
Support aliased import detection in unit test discovery

File Walkthrough

	Relevant files

github-actions · 2025-07-25T21:15:14Z

PR Reviewer Guide 🔍

(Review updated until commit `41d314d`)

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review Missing Import Usage of importlib.util.find_spec without importing importlib is likely to cause a NameError. PYTEST_BENCHMARK_INSTALLED = importlib.util.find_spec("pytest_benchmark") is not None Missing Attribute Reference to `codeflash_benchmark_plugin.project_root` may fail if `project_root` is not defined on the plugin instance. Path(str(node_path)), Path(codeflash_benchmark_plugin.project_root), traverse_up=True Undefined Variable The f-string uses `{self}` in the test function definition, but `self` is undefined in this scope and will cause a NameError. unique_test_name = get_unique_test_name(module_name, function_name, benchmark_function_name, class_name) test_template += f"def test_{unique_test_name}({self}):\n{formatted_test_body}\n"

github-actions · 2025-07-25T21:16:12Z

PR Code Suggestions ✨

Latest suggestions up to 41d314d

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Fix undefined test parameters Replace the undefined `{self}` placeholder with a concrete parameter list based on the test framework so that pytest functions have no parameters and unittest methods include `self`. Initialize a variable for the parameter list before constructing the signature and use it in the f-string. codeflash/benchmarking/replay_test.py [226] -test_template += f"def test_{unique_test_name}({self}):\n{formatted_test_body}\n" +params = "self" if test_framework == "unittest" else "" +test_template += f"def test_{unique_test_name}({params}):\n{formatted_test_body}\n" Suggestion importance[1-10]: 8 __ Why: The use of an undefined `{self}` in the generated test signature will break test generation, and this fix provides a correct parameter list based on `test_framework`.	Medium
General	Support import alias mapping Also capture aliases from top‐level `import` statements so that references like `import` `module as m` are recognized during discovery. Implement a `visit_Import` method to map `alias.asname` to the original module name. codeflash/discovery/discover_unit_tests.py [211-213] -def visit_ImportFrom(self, node: ast.ImportFrom) -> None: - ... - if alias.asname: - self.alias_mapping[imported_name] = aname +def visit_Import(self, node: ast.Import) -> None: + for alias in node.names: + original = alias.name + alias_name = alias.asname or alias.name + self.imported_modules.add(alias_name) + if alias.asname: + self.alias_mapping[alias_name] = original Suggestion importance[1-10]: 6 __ Why: Adding a `visit_Import` method captures alias mappings for top‐level `import` statements, making the test discovery more robust though it’s not critical.	Low
General	Log pytest stdout too Include the captured standard output in the debug logs to provide visibility into successful test runs or collection errors, not just stderr. codeflash/benchmarking/trace_benchmarks.py [36-37] logger.info(f"ran with args: {result.args}") -logger.info(f"Pytest output: {result.stderr}") +logger.info(f"Pytest stdout: {result.stdout}") +logger.info(f"Pytest stderr: {result.stderr}") Suggestion importance[1-10]: 4 __ Why: Logging `stdout` provides additional debugging information but is a minor enhancement.	Low

Previous suggestions

Suggestions up to commit d3a0035

Category	Suggestion	Impact
General	Remove unnecessary sleep Remove the arbitrary `sleep(0.0001)` delay since it unnecessarily slows down processing of test files. If a pause is truly required, document its purpose or make it configurable. codeflash/discovery/discover_unit_tests.py [524] -sleep(0.0001) +# Removed artificial sleep to speed up processing Suggestion importance[1-10]: 5 __ Why: The `sleep(0.0001)` delay is arbitrary and adds unnecessary latency without clear benefit, so removing it improves performance.	Low
Possible issue	Always return BenchmarkFixture Ensure that when pytest-benchmark is installed you always return a `BenchmarkFixture` to avoid silently falling back to a no-op lambda. Remove the `if bs:` guard and supply default values when `bs` is `None`, so the fixture is consistently returned. codeflash/benchmarking/plugin/plugin.py [314-333] if PYTEST_BENCHMARK_INSTALLED: from pytest_benchmark.fixture import BenchmarkFixture as BSF # noqa: N814 bs = getattr(config, "_benchmarksession", None) if bs and bs.skip: pytest.skip("Benchmarks are skipped (--benchmark-skip was used).") node = request.node marker = node.get_closest_marker("benchmark") options = dict(marker.kwargs) if marker else {} - if bs: - return BSF( - node, - add_stats=bs.benchmarks.append, - logger=bs.logger, - warner=request.node.warn, - disabled=bs.disabled, - dict(bs.options, options), - ) + return BSF( + node, + add_stats=bs.benchmarks.append if bs else None, + logger=bs.logger if bs else None, + warner=request.node.warn, + disabled=bs.disabled if bs else False, + dict(getattr(bs, "options", {}), options), + ) Suggestion importance[1-10]: 4 __ Why: The change ensures a `BenchmarkFixture` is always returned, but overriding defaults when `bs` is `None` may lead to runtime errors due to missing session attributes.	Low

Suggestions up to commit ad90c2d

Category	Suggestion	Impact
Possible issue	Use local Benchmark reference The fixture references `codeflash_benchmark_plugin` which isn't imported or defined in this file. Use the locally defined `Benchmark` class directly instead of the undefined module prefix. codeflash/benchmarking/plugin/plugin.py [315] -return codeflash_benchmark_plugin.Benchmark(request) +return Benchmark(request) Suggestion importance[1-10]: 10 __ Why: The reference to `codeflash_benchmark_plugin` is undefined causing a NameError, so calling the locally defined `Benchmark` class directly is necessary.	High

KRRT7 · 2025-07-25T21:38:36Z

github-actions · 2025-07-25T21:40:06Z

Persistent review updated to latest commit d3a0035

…(`benchmark-fixture-fix`) Here is an optimized version, **rewritten for maximum runtime performance**. - **Major hotspots** (from the profiler) are: - `get_function_alias` and `get_unique_test_name` – called thousands of times, simple string ops. - Many calls to `get_function_alias` **duplicate input** in many places. - Many repeated lookups (`func.get("key")`). - `textwrap.indent` and especially `textwrap.dedent` called many times with the same strings. - **Optimization strategies:** - **Cache** results of `get_function_alias` and `get_unique_test_name` per argument tuple (using `functools.lru_cache`). - **Pre-dedent** test templates once at function scope, not in loop. - **Minimize string concatenation**, and loop variable lookups. - Use **list building** + `''.join()` for string results where appropriate. - Avoid repeated `str.format`/f-string reinterpretation in tight loops: compose test bodies fully with known variables, not dynamic format. - Inline/copy critical logic to avoid redundant function calls per-loop where possible (without changing output). - Inline simple get-alias pattern for unique test name. - **Batch key access** from `func` via local variables, and give fast locals in all loops. --- --- **Key optimizations explained:** - `@lru_cache` for `get_function_alias` and `get_unique_test_name`: avoids recomputation for repeated arguments, which are very frequent. - Templates are dedented **once** at module load, dramatically cutting cost of repeated dedentation per test case. - All `func.get("key")` inside tight loops are replaced by direct `func["key"]` to avoid re-parsing and make local lookups faster. - String concatenations are gathered in lists and joined at the end for efficiency (instead of `+=`). - All template string field subs are consolidated to minimize calls to slow formatting/f-strings inside loops. - Minimizes repeated lookups and attribute access by using local variables for everything repeatedly accessed in loops. This should result in very significant **runtime reduction** for large numbers of test generation, as all major hotspots are addressed. Output remains identical to the original.

codeflash-ai · 2025-07-28T06:14:48Z

⚡️ Codeflash found optimizations for this PR

📄 36% (0.36x) speedup for `create_trace_replay_test_code` in `codeflash/benchmarking/replay_test.py`

⏱️ Runtime : 10.5 milliseconds → 7.71 milliseconds (best of 325 runs)

I created a new dependent PR with the suggested changes. Please review:

⚡️ Speed up function create_trace_replay_test_code by 36% in PR #586 (benchmark-fixture-fix) #588

If you approve, it will be merged into this PR (branch benchmark-fixture-fix).

github-actions · 2025-07-28T23:48:05Z

Persistent review updated to latest commit 41d314d

misrasaurabh1

LGTM

misrasaurabh1

LGTM

github-actions bot added the Review effort 2/5 label Jul 25, 2025

KRRT7 marked this pull request as ready for review July 25, 2025 21:38

KRRT7 requested a review from misrasaurabh1 July 25, 2025 21:38

github-actions bot added Review effort 3/5 and removed Review effort 2/5 labels Jul 25, 2025

codeflash-ai bot mentioned this pull request Jul 28, 2025

⚡️ Speed up function create_trace_replay_test_code by 36% in PR #586 (benchmark-fixture-fix) #588

Closed

KRRT7 closed this Jul 28, 2025

KRRT7 force-pushed the benchmark-fixture-fix branch from 78fa5b4 to b478f10 Compare July 28, 2025 23:45

cleanup benchmark

41d314d

KRRT7 reopened this Jul 28, 2025

KRRT7 added 3 commits July 28, 2025 17:04

remove debug prints

4bddc1e

da rulez

bd3d331

Update pyproject.toml

974b891

misrasaurabh1 approved these changes Jul 29, 2025

View reviewed changes

KRRT7 merged commit b2c2743 into main Jul 29, 2025
17 checks passed

KRRT7 deleted the benchmark-fixture-fix branch July 29, 2025 01:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Benchmark Fixture fixes #586

Benchmark Fixture fixes #586

Uh oh!

KRRT7 commented Jul 25, 2025 •

edited by github-actions bot

Loading

Uh oh!

github-actions bot commented Jul 25, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Jul 25, 2025 •

edited

Loading

Uh oh!

KRRT7 commented Jul 25, 2025

Uh oh!

github-actions bot commented Jul 25, 2025

Uh oh!

codeflash-ai bot commented Jul 28, 2025

⚡️ Speed up function `create_trace_replay_test_code` by 36% in PR #586 (`benchmark-fixture-fix`) #588

Uh oh!

github-actions bot commented Jul 28, 2025

Uh oh!

misrasaurabh1 left a comment

Uh oh!

misrasaurabh1 left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Benchmark Fixture fixes #586

Benchmark Fixture fixes #586

Uh oh!

Conversation

KRRT7 commented Jul 25, 2025 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

User description

PR Type

Description

File Walkthrough

Uh oh!

github-actions bot commented Jul 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Reviewer Guide 🔍

(Review updated until commit 41d314d)

Uh oh!

github-actions bot commented Jul 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Suggestions ✨

Previous suggestions

Uh oh!

KRRT7 commented Jul 25, 2025

Uh oh!

github-actions bot commented Jul 25, 2025

Uh oh!

codeflash-ai bot commented Jul 28, 2025

⚡️ Codeflash found optimizations for this PR

📄 36% (0.36x) speedup for create_trace_replay_test_code in codeflash/benchmarking/replay_test.py

I created a new dependent PR with the suggested changes. Please review:

⚡️ Speed up function create_trace_replay_test_code by 36% in PR #586 (benchmark-fixture-fix) #588

Uh oh!

github-actions bot commented Jul 28, 2025

Uh oh!

misrasaurabh1 left a comment

Choose a reason for hiding this comment

Uh oh!

misrasaurabh1 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

KRRT7 commented Jul 25, 2025 •

edited by github-actions bot

Loading

github-actions bot commented Jul 25, 2025 •

edited

Loading

(Review updated until commit `41d314d`)

github-actions bot commented Jul 25, 2025 •

edited

Loading

📄 36% (0.36x) speedup for `create_trace_replay_test_code` in `codeflash/benchmarking/replay_test.py`

⚡️ Speed up function `create_trace_replay_test_code` by 36% in PR #586 (`benchmark-fixture-fix`) #588