Commit 51a9351

Merge branch 'main' into main
2 parents 8ca1806 + 681fa17 commit 51a9351

10 files changed, +559 -291 lines changed

WARP.md

Lines changed: 191 additions & 0 deletions
@@ -0,0 +1,191 @@
+# WARP.md
+
+This file provides guidance to WARP (warp.dev) when working with code in this repository.
+
+## Project Overview
+
+Codeflash is a general-purpose optimizer for Python that helps improve code performance while maintaining correctness. It uses advanced LLMs to generate optimization ideas, tests them for correctness, and benchmarks them for performance, then creates merge-ready pull requests.
+
+## Development Environment Setup
+
+### Prerequisites
+- Python 3.9+ (project uses uv for dependency management)
+- Git (for version control and PR creation)
+- Codeflash API key (for AI services)
+
+### Initial Setup
+```bash
+# Install dependencies using uv (preferred over pip)
+uv sync
+
+# Initialize codeflash configuration
+uv run codeflash init
+```
+
+## Core Development Commands
+
+### Code Quality & Linting
+```bash
+# Format code with ruff (includes check and format)
+uv run ruff check --fix codeflash/
+uv run ruff format codeflash/
+
+# Type checking with mypy
+uv run mypy codeflash/
+
+# Pre-commit hooks (ruff check + format)
+uv run pre-commit run --all-files
+```
+
+### Testing
+```bash
+# Run all tests
+uv run pytest
+
+# Run specific test file
+uv run pytest tests/test_specific_file.py
+
+# Run tests matching pattern
+uv run pytest -k "pattern"
+
+```
+
+### Running Codeflash
+```bash
+# Optimize entire codebase
+uv run codeflash --all
+
+# Optimize specific file
+uv run codeflash --file path/to/file.py
+
+# Optimize specific function
+uv run codeflash --function "module.function"
+
+# Optimize a script end-to-end
+uv run codeflash optimize script.py
+
+# Run with benchmarking
+uv run codeflash --benchmark
+
+# Verify setup
+uv run codeflash --verify-setup
+```
+
+## Architecture Overview
+
+### Main Components
+
+**Core Modules:**
+- `codeflash/main.py` - CLI entry point and command coordination
+- `codeflash/cli_cmds/` - Command-line interface implementations
+- `codeflash/optimization/` - Core optimization engine and algorithms
+- `codeflash/verification/` - Code correctness verification
+- `codeflash/benchmarking/` - Performance measurement and comparison
+- `codeflash/discovery/` - Code analysis and function discovery
+- `codeflash/tracing/` - Runtime tracing and profiling
+- `codeflash/context/` - Code context extraction and analysis
+- `codeflash/result/` - Result processing, PR creation, and explanations
+
+**Supporting Systems:**
+- `codeflash/api/` - Backend API communication
+- `codeflash/github/` - GitHub integration for PR creation
+- `codeflash/models/` - Data models and schemas
+- `codeflash/telemetry/` - Analytics and error reporting
+- `codeflash/code_utils/` - Code parsing, formatting, and manipulation utilities
+
+### Key Workflows
+
+1. **Code Discovery**: Analyzes codebase to identify optimization candidates
+2. **Context Extraction**: Extracts relevant code context and dependencies
+3. **Optimization Generation**: Uses LLMs to generate optimization candidates
+4. **Verification**: Tests optimizations for correctness using existing tests
+5. **Benchmarking**: Measures performance improvements
+6. **Result Processing**: Creates explanations and pull requests
+
+### Configuration
+
+Configuration is stored in `pyproject.toml` under `[tool.codeflash]`:
+- `module-root` - Source code location (default: "codeflash")
+- `tests-root` - Test location (default: "tests")
+- `benchmarks-root` - Benchmark location (default: "tests/benchmarks")
+- `test-framework` - Testing framework ("pytest" or "unittest")
+- `formatter-cmds` - Commands for code formatting
+
+## Project Structure
+
+```
+codeflash/
+├── api/ # Backend API communication
+├── benchmarking/ # Performance measurement
+├── cli_cmds/ # CLI command implementations
+├── code_utils/ # Code analysis and manipulation
+├── context/ # Code context extraction
+├── discovery/ # Function and test discovery
+├── github/ # GitHub API integration
+├── lsp/ # Language server protocol support
+├── models/ # Data models and schemas
+├── optimization/ # Core optimization engine
+├── result/ # Result processing and PR creation
+├── telemetry/ # Analytics and monitoring
+├── tracing/ # Runtime tracing and profiling
+├── verification/ # Correctness verification
+└── main.py # CLI entry point
+
+tests/ # Test suite
+├── benchmarks/ # Performance benchmarks
+└── scripts/ # Test utilities
+
+docs/ # Documentation
+code_to_optimize/ # Example code for optimization
+codeflash-benchmark/ # Benchmark workspace member
+```
+
+## Development Notes
+
+### Code Style
+- Uses ruff for linting and formatting (configured in pyproject.toml)
+- Strict mypy type checking enabled
+- Pre-commit hooks enforce code quality
+
+### Testing
+- pytest-based test suite with extensive coverage
+- Parameterized tests for multiple scenarios
+- Benchmarking tests for performance validation
+- Test discovery supports both pytest and unittest frameworks
+
+### Workspace Structure
+- Uses uv workspace with `codeflash-benchmark` as a member
+- Dependencies managed through uv.lock
+- Dynamic versioning from git tags using uv-dynamic-versioning
+
+### Build & Distribution
+- Uses hatchling as build backend
+- BSL-1.1 license
+- Excludes development files from distribution packages
+
+### CI/CD Integration
+- GitHub Actions workflow for automatic optimization of PR code
+- Pre-commit hooks for code quality enforcement
+- Automated testing and benchmarking
+
+## Important Patterns
+
+### Error Handling
+- Uses `either.py` for functional error handling patterns
+- Comprehensive error tracking through Sentry integration
+- Graceful degradation when AI services are unavailable
+
+### Instrumentation
+- Extensive tracing capabilities for performance analysis
+- Line profiler integration for detailed performance metrics
+- Custom tracer implementation for code execution analysis
+
+### AI Integration
+- Structured prompts and response handling for LLM interactions
+- Critic module for evaluating optimization quality
+- Context-aware code generation and explanation
+
+### Git Integration
+- GitPython for repository operations
+- Automated PR creation with detailed explanations
+- Branch management for optimization experiments
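
Note on the Configuration section of WARP.md: the keys are documented with hyphens (`module-root`), while `cmd_init.py` in this same commit looks them up with underscores (`config["module_root"]`). The sketch below shows one way such a table could be read and normalized. It is only an illustration: `read_codeflash_config` and `EXAMPLE_PYPROJECT` are hypothetical names, and the hyphen-to-underscore step is an assumption about what `parse_config_file` does, not a copy of it.

```python
try:
    import tomllib  # standard library from Python 3.11 onward
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport with the same API for 3.9/3.10

# Hypothetical pyproject.toml content using the defaults listed in WARP.md.
EXAMPLE_PYPROJECT = """
[tool.codeflash]
module-root = "codeflash"
tests-root = "tests"
benchmarks-root = "tests/benchmarks"
test-framework = "pytest"
"""


def read_codeflash_config(toml_text: str) -> dict:
    """Return the [tool.codeflash] table with hyphenated keys turned into underscores."""
    data = tomllib.loads(toml_text)
    section = data.get("tool", {}).get("codeflash", {})
    # Assumption: the real parser normalizes "module-root" to "module_root".
    return {key.replace("-", "_"): value for key, value in section.items()}


config = read_codeflash_config(EXAMPLE_PYPROJECT)
print(config["module_root"], config["tests_root"])
```

A missing `[tool.codeflash]` table yields an empty dict here, which would make the `"module_root" not in config` check in `should_modify_pyproject_toml` take the reconfigure path.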

codeflash-benchmark/codeflash_benchmark/plugin.py

Lines changed: 38 additions & 5 deletions
@@ -8,21 +8,54 @@

 PYTEST_BENCHMARK_INSTALLED = importlib.util.find_spec("pytest_benchmark") is not None

+benchmark_options = [
+    ("--benchmark-columns", "store", None, "Benchmark columns"),
+    ("--benchmark-group-by", "store", None, "Benchmark group by"),
+    ("--benchmark-name", "store", None, "Benchmark name pattern"),
+    ("--benchmark-sort", "store", None, "Benchmark sort column"),
+    ("--benchmark-json", "store", None, "Benchmark JSON output file"),
+    ("--benchmark-save", "store", None, "Benchmark save name"),
+    ("--benchmark-warmup", "store", None, "Benchmark warmup"),
+    ("--benchmark-warmup-iterations", "store", None, "Benchmark warmup iterations"),
+    ("--benchmark-min-time", "store", None, "Benchmark minimum time"),
+    ("--benchmark-max-time", "store", None, "Benchmark maximum time"),
+    ("--benchmark-min-rounds", "store", None, "Benchmark minimum rounds"),
+    ("--benchmark-timer", "store", None, "Benchmark timer"),
+    ("--benchmark-calibration-precision", "store", None, "Benchmark calibration precision"),
+    ("--benchmark-disable", "store_true", False, "Disable benchmarks"),
+    ("--benchmark-skip", "store_true", False, "Skip benchmarks"),
+    ("--benchmark-only", "store_true", False, "Only run benchmarks"),
+    ("--benchmark-verbose", "store_true", False, "Verbose benchmark output"),
+    ("--benchmark-histogram", "store", None, "Benchmark histogram"),
+    ("--benchmark-compare", "store", None, "Benchmark compare"),
+    ("--benchmark-compare-fail", "store", None, "Benchmark compare fail threshold"),
+]
+

 def pytest_configure(config: pytest.Config) -> None:
     """Register the benchmark marker and disable conflicting plugins."""
     config.addinivalue_line("markers", "benchmark: mark test as a benchmark that should be run with codeflash tracing")

-    if config.getoption("--codeflash-trace") and PYTEST_BENCHMARK_INSTALLED:
-        config.option.benchmark_disable = True
-        config.pluginmanager.set_blocked("pytest_benchmark")
-        config.pluginmanager.set_blocked("pytest-benchmark")
+    if config.getoption("--codeflash-trace"):
+        # When --codeflash-trace is used, ignore all benchmark options by resetting them to defaults
+        for option, _, default, _ in benchmark_options:
+            option_name = option.replace("--", "").replace("-", "_")
+            if hasattr(config.option, option_name):
+                setattr(config.option, option_name, default)
+
+        if PYTEST_BENCHMARK_INSTALLED:
+            config.pluginmanager.set_blocked("pytest_benchmark")
+            config.pluginmanager.set_blocked("pytest-benchmark")


 def pytest_addoption(parser: pytest.Parser) -> None:
     parser.addoption(
         "--codeflash-trace", action="store_true", default=False, help="Enable CodeFlash tracing for benchmarks"
     )
+    # These options are ignored when --codeflash-trace is used
+    for option, action, default, help_text in benchmark_options:
+        help_suffix = " (ignored when --codeflash-trace is used)"
+        parser.addoption(option, action=action, default=default, help=help_text + help_suffix)


 @pytest.fixture
@@ -37,7 +70,7 @@ def benchmark(request: pytest.FixtureRequest) -> object:
     # If pytest-benchmark is installed and --codeflash-trace is not enabled,
     # return the normal pytest-benchmark fixture
     if PYTEST_BENCHMARK_INSTALLED:
-        from pytest_benchmark.fixture import BenchmarkFixture as BSF  # noqa: N814
+        from pytest_benchmark.fixture import BenchmarkFixture as BSF  # pyright: ignore[reportMissingImports] # noqa: I001, N814

         bs = getattr(config, "_benchmarksession", None)
         if bs and bs.skip:
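
Note on the `pytest_configure` change: the new loop turns each flag into an attribute name and resets it to its default when `--codeflash-trace` is active. The sketch below reproduces that step in isolation. It is a simplified stand-in, assuming `argparse.Namespace` behaves like pytest's `config.option` for attribute access; `reset_benchmark_options` is a hypothetical helper name, and the option list is shortened.

```python
from argparse import Namespace

# Shortened copy of the (flag, action, default, help) tuples from the diff.
benchmark_options = [
    ("--benchmark-json", "store", None, "Benchmark JSON output file"),
    ("--benchmark-min-rounds", "store", None, "Benchmark minimum rounds"),
    ("--benchmark-only", "store_true", False, "Only run benchmarks"),
]


def reset_benchmark_options(options: Namespace) -> None:
    """Reset every known benchmark option present on `options` to its default."""
    for flag, _action, default, _help in benchmark_options:
        # "--benchmark-min-rounds" becomes "benchmark_min_rounds"
        attr = flag.replace("--", "").replace("-", "_")
        if hasattr(options, attr):
            setattr(options, attr, default)


# Stand-in for config.option after a user passed some benchmark flags.
opts = Namespace(benchmark_json="out.json", benchmark_only=True, verbose=1)
reset_benchmark_options(opts)
print(opts)
```

The `hasattr` guard means options that were never registered are skipped and not created, and unrelated attributes such as `verbose` are left alone.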

codeflash-benchmark/pyproject.toml

Lines changed: 3 additions & 3 deletions
@@ -1,6 +1,6 @@
 [project]
 name = "codeflash-benchmark"
-version = "0.1.0"
+version = "0.2.0"
 description = "Pytest benchmarking plugin for codeflash.ai - automatic code performance optimization"
 authors = [{ name = "CodeFlash Inc.", email = "[email protected]" }]
 requires-python = ">=3.9"
@@ -25,8 +25,8 @@ Repository = "https://github.com/codeflash-ai/codeflash-benchmark"
 codeflash-benchmark = "codeflash_benchmark.plugin"

 [build-system]
-requires = ["setuptools>=45", "wheel", "setuptools_scm"]
+requires = ["setuptools>=45", "wheel"]
 build-backend = "setuptools.build_meta"

 [tool.setuptools]
-packages = ["codeflash_benchmark"]
+packages = ["codeflash_benchmark"]

codeflash/api/aiservice.py

Lines changed: 3 additions & 0 deletions
@@ -360,6 +360,7 @@ def log_results(  # noqa: D417
         is_correct: dict[str, bool] | None,
         optimized_line_profiler_results: dict[str, str] | None,
         metadata: dict[str, Any] | None,
+        optimizations_post: dict[str, str] | None = None,
     ) -> None:
         """Log features to the database.

@@ -372,6 +373,7 @@ def log_results(  # noqa: D417
         - is_correct (Optional[Dict[str, bool]]): Whether the optimized code is correct.
         - optimized_line_profiler_results: line_profiler results for every candidate mapped to their optimization_id
         - metadata: contains the best optimization id
+        - optimizations_post - dict mapping opt id to code str after postprocessing

         """
         payload = {
@@ -383,6 +385,7 @@ def log_results(  # noqa: D417
             "codeflash_version": codeflash_version,
             "optimized_line_profiler_results": optimized_line_profiler_results,
             "metadata": metadata,
+            "optimizations_post": optimizations_post,
         }
         try:
             self.make_ai_service_request("/log_features", payload=payload, timeout=5)
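
Note on the `log_results` change: `optimizations_post` is added as a trailing parameter with a `None` default, so existing call sites that do not pass it should keep working. The sketch below shows that property with a cut-down payload builder. `build_log_payload` and the `best_optimization_id` key are hypothetical and exist only for this illustration; the real method sends more fields.

```python
from __future__ import annotations

from typing import Any


def build_log_payload(
    metadata: dict[str, Any] | None,
    optimizations_post: dict[str, str] | None = None,
) -> dict[str, Any]:
    """Build a reduced payload; the new key is always present, possibly as None."""
    return {"metadata": metadata, "optimizations_post": optimizations_post}


# A caller written before the change: the new field is simply None.
old_style = build_log_payload({"best_optimization_id": "opt-1"})

# A caller that passes the postprocessed code for each optimization id.
new_style = build_log_payload(
    {"best_optimization_id": "opt-1"},
    optimizations_post={"opt-1": "def f():\n    return 1\n"},
)
print(old_style["optimizations_post"], sorted(new_style["optimizations_post"]))
```

Whether the `/log_features` endpoint treats a `None` value differently from a missing key is not visible in this diff.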

codeflash/cli_cmds/cmd_init.py

Lines changed: 18 additions & 12 deletions
@@ -85,12 +85,16 @@ def init_codeflash() -> None:

         did_add_new_key = prompt_api_key()

-        if should_modify_pyproject_toml():
-            setup_info: SetupInfo = collect_setup_info()
+        should_modify, config = should_modify_pyproject_toml()
+
+        git_remote = config.get("git_remote", "origin") if config else "origin"

+        if should_modify:
+            setup_info: SetupInfo = collect_setup_info()
+            git_remote = setup_info.git_remote
             configure_pyproject_toml(setup_info)

-        install_github_app()
+        install_github_app(git_remote)

         install_github_actions(override_formatter_check=True)

@@ -151,7 +155,7 @@ def ask_run_end_to_end_test(args: Namespace) -> None:
     run_end_to_end_test(args, bubble_sort_path, bubble_sort_test_path)


-def should_modify_pyproject_toml() -> bool:
+def should_modify_pyproject_toml() -> tuple[bool, dict[str, Any] | None]:
     """Check if the current directory contains a valid pyproject.toml file with codeflash config.

     If it does, ask the user if they want to re-configure it.
@@ -160,22 +164,22 @@ def should_modify_pyproject_toml() -> bool:

     pyproject_toml_path = Path.cwd() / "pyproject.toml"
     if not pyproject_toml_path.exists():
-        return True
+        return True, None
     try:
         config, config_file_path = parse_config_file(pyproject_toml_path)
     except Exception:
-        return True
+        return True, None

     if "module_root" not in config or config["module_root"] is None or not Path(config["module_root"]).is_dir():
-        return True
+        return True, None
     if "tests_root" not in config or config["tests_root"] is None or not Path(config["tests_root"]).is_dir():
-        return True
+        return True, None

     return Confirm.ask(
         "✅ A valid Codeflash config already exists in this project. Do you want to re-configure it?",
         default=False,
         show_default=True,
-    )
+    ), config


 # Custom theme for better UX
@@ -958,16 +962,18 @@ def configure_pyproject_toml(setup_info: SetupInfo) -> None:
     click.echo()


-def install_github_app() -> None:
+def install_github_app(git_remote: str) -> None:
     try:
         git_repo = git.Repo(search_parent_directories=True)
     except git.InvalidGitRepositoryError:
         click.echo("Skipping GitHub app installation because you're not in a git repository.")
         return
-    owner, repo = get_repo_owner_and_name(git_repo)
+    owner, repo = get_repo_owner_and_name(git_repo, git_remote)

     if is_github_app_installed_on_repo(owner, repo, suppress_errors=True):
-        click.echo("🐙 Looks like you've already installed the Codeflash GitHub app on this repository! Continuing…")
+        click.echo(
+            f"🐙 Looks like you've already installed the Codeflash GitHub app on this repository ({owner}/{repo})! Continuing…"
+        )

     else:
         click.prompt(
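
Note on the `init_codeflash` change: the remote passed to `install_github_app` now comes from the existing config when one is found, falls back to `"origin"`, and is overridden by the freshly collected setup info when the user reconfigures. The sketch below isolates that precedence. `resolve_git_remote` and its `setup_remote` parameter are hypothetical; in the diff the override value is `setup_info.git_remote`.

```python
from __future__ import annotations

from typing import Any, Optional


def resolve_git_remote(
    should_modify: bool,
    config: Optional[dict[str, Any]],
    setup_remote: Optional[str] = None,
) -> str:
    """Pick the git remote with the same precedence as the diff."""
    # Existing config wins over the hard-coded default.
    remote = config.get("git_remote", "origin") if config else "origin"
    # When the user reconfigures, the newly collected value replaces it.
    if should_modify and setup_remote is not None:
        remote = setup_remote
    return remote


print(resolve_git_remote(False, {"git_remote": "upstream"}))
print(resolve_git_remote(True, None, "fork"))
```

Because every early `return True, None` in `should_modify_pyproject_toml` discards the parsed config, the `"origin"` fallback only matters on those paths if `collect_setup_info` does not supply a remote.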

codeflash/discovery/pytest_new_process_discovery.py

Lines changed: 2 additions & 1 deletion
@@ -41,7 +41,8 @@ def parse_pytest_collection_results(pytest_tests: list[Any]) -> list[dict[str, s

     try:
         exitcode = pytest.main(
-            [tests_root, "-p no:logging", "--collect-only", "-m", "not skip"], plugins=[PytestCollectionPlugin()]
+            [tests_root, "-p no:logging", "--collect-only", "-m", "not skip", "-p", "no:codeflash-benchmark"],
+            plugins=[PytestCollectionPlugin()],
         )
     except Exception as e:
         print(f"Failed to collect tests: {e!s}")
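
Note on the collection change: pytest's `-p no:NAME` option blocks a plugin by name, and the commit uses it to keep the `codeflash-benchmark` plugin out of the test-collection subprocess. The sketch below builds such an argument list without invoking pytest. `build_collection_args` is a hypothetical helper, and it emits every block as a separate `-p`, `no:NAME` pair, whereas the diff passes `"-p no:logging"` as a single string.

```python
def build_collection_args(tests_root: str, blocked_plugins: list) -> list:
    """Build pytest arguments for a collect-only run with some plugins blocked."""
    args = [tests_root, "--collect-only", "-m", "not skip"]
    for name in blocked_plugins:
        # "-p no:<name>" tells pytest not to load the named plugin.
        args += ["-p", f"no:{name}"]
    return args


# "tests" is a placeholder path for illustration.
args = build_collection_args("tests", ["logging", "codeflash-benchmark"])
print(args)
```

The resulting list could be handed to `pytest.main(args, plugins=[...])` in the same way the diff does.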
