10 changes: 7 additions & 3 deletions README.md
@@ -17,11 +17,11 @@ It uses advanced LLMs to generate multiple optimization ideas for your code, tes
How to use Codeflash -
- Optimize an entire existing codebase by running `codeflash --all`
- Automate optimizing all __future__ code you will write by installing Codeflash as a GitHub action.
- Optimize a Python workflow `python myscript.py` end-to-end by running `python -m codeflash.tracer -o benchmark.trace myscript.py`
- Optimize a Python workflow `python myscript.py` end-to-end by running `codeflash optimize myscript.py`

Codeflash is used by top engineering teams at [Pydantic](https://github.com/pydantic/pydantic/pulls?q=is%3Apr+author%3Amisrasaurabh1+is%3Amerged), [Langflow](https://github.com/langflow-ai/langflow/issues?q=state%3Aclosed%20is%3Apr%20author%3Amisrasaurabh1), [Albumentations](https://github.com/albumentations-team/albumentations/issues?q=state%3Amerged%20is%3Apr%20author%3Akrrt7%20OR%20state%3Amerged%20is%3Apr%20author%3Aaseembits93%20) and many others to ship performant, expert level code.
Codeflash is used by top engineering teams at [Pydantic](https://github.com/pydantic/pydantic/pulls?q=is%3Apr+author%3Amisrasaurabh1+is%3Amerged), [Langflow](https://github.com/langflow-ai/langflow/issues?q=state%3Aclosed%20is%3Apr%20author%3Amisrasaurabh1), [Roboflow](https://github.com/roboflow/inference/pulls?q=is%3Apr+is%3Amerged+codeflash+sort%3Acreated-asc), [Albumentations](https://github.com/albumentations-team/albumentations/issues?q=state%3Amerged%20is%3Apr%20author%3Akrrt7%20OR%20state%3Amerged%20is%3Apr%20author%3Aaseembits93%20) and many others to ship performant, expert level code.

Codeflash is great at optimizing AI Agents, Computer Vision algorithms, numerical code, backend code or anything else you might write with Python.
Codeflash is great at optimizing AI Agents, Computer Vision algorithms, PyTorch code, numerical code, backend code or anything else you might write with Python.


## Installation
@@ -50,6 +50,10 @@ Add codeflash as a development time dependency if you are using package managers
codeflash --all
```
This can take a while to run for a large codebase, but it will keep opening PRs as it finds optimizations.
3. Optimize a script:
```
codeflash optimize myscript.py
```

## Documentation
For detailed installation and usage instructions, visit our documentation at [docs.codeflash.ai](https://docs.codeflash.ai)
11 changes: 1 addition & 10 deletions docs/docs/getting-started/local-installation.md
@@ -145,7 +145,6 @@ codeflash --all # optimize the entire repo
Make sure:
- ✅ Your virtual environment is activated
- ✅ All project dependencies are installed
- ✅ You're running `codeflash` from your project root

#### 🧪 "No optimizations found" or debugging issues
Use the `--verbose` flag for detailed output:
@@ -165,18 +164,10 @@ Verify:
- 🔍 Tests are discoverable by your test framework
- 📝 Test files follow naming conventions (`test_*.py` for pytest)

#### ⚙️ Configuration issues
Check your `pyproject.toml`:
```toml
[tool.codeflash]
module = "my_package"
test-framework = "pytest"
tests = "tests/"
```

### Next Steps

- Learn about [Codeflash Concepts](/codeflash-concepts/how-codeflash-works)
- Explore [optimization workflows](/optimizing-with-codeflash/one-function)
- Explore [Optimization workflows](/optimizing-with-codeflash/one-function)
- Set up [GitHub Actions integration](/getting-started/codeflash-github-actions)
- Read [configuration options](/configuration) for advanced setups
13 changes: 7 additions & 6 deletions docs/docs/intro.md
@@ -27,12 +27,13 @@ This is a great way to ensure that your code, your team's code and your AI Agent

<!--- TODO: Add links to the relevant sections of the documentation and style the table --->

| Feature | Description |
|-----------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [Optimize a single function](optimizing-with-codeflash/one-function) | Basic unit of optimization by asking Codeflash to optimize a particular function |
| [Optimize all code in a repo](optimizing-with-codeflash/codeflash-all) | Codeflash discovers all functions in a repo and optimizes all of them! |
| [Optimize every new pull request](optimizing-with-codeflash/optimize-prs) | Codeflash runs as a GitHub action and GitHub app and reviews all new code for Optimizations |
| [Optimize a whole workflow by Tracing it](optimizing-with-codeflash/trace-and-optimize) | End to end optimization for all the functions called in a workflow, by tracing to collect real inputs seen during execution and ensuring correctness and performance optimization with those inputs |
| Feature | Description |
|-----------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [Optimize a single function](optimizing-with-codeflash/one-function) | `codeflash --file path.py --function myfunction` Basic unit of optimization by asking Codeflash to optimize a particular function |
| [Optimize an entire workflow](optimizing-with-codeflash/trace-and-optimize) | `codeflash optimize myscript.py` End to end optimization for all the functions called in a workflow, by tracing to collect real inputs to ensure correctness and e2e performance optimization |
| [Optimize all code in a repo](optimizing-with-codeflash/codeflash-all) | `codeflash --all` Codeflash discovers all functions in a repo and optimizes all of them! |
| [Optimize every new pull request](optimizing-with-codeflash/optimize-prs) | `codeflash init-actions` Codeflash runs as a GitHub action and GitHub app and reviews all new code for Optimizations |


## How to use these docs

2 changes: 1 addition & 1 deletion docs/docs/optimizing-with-codeflash/benchmarking.md
@@ -1,7 +1,7 @@
---
sidebar_position: 5
---
# Using Benchmarks
# Using Benchmarks in CI

Codeflash is able to determine the impact of an optimization on predefined benchmarks, when used in benchmark mode.

121 changes: 67 additions & 54 deletions docs/docs/optimizing-with-codeflash/trace-and-optimize.md
@@ -1,84 +1,97 @@
---
sidebar_position: 4
---
# Optimize Workflows End-to-End by Tracing them.
# Optimize Workflows End-to-End.

Codeflash supports optimizing an entire Python script end to end by tracing the execution of the script and generating Replay Tests. Tracing means following the execution of a script and capturing the inputs to all the functions called, so they can be replayed again while optimizing. These Replay Tests can be used to optimize all the functions called in the script.
Codeflash supports optimizing an entire Python script end-to-end by tracing the script's execution and generating Replay Tests. Tracing follows the execution of a script, profiles it and captures inputs to all called functions, allowing them to be replayed during optimization. Codeflash uses these Replay Tests to optimize all functions called in the script, starting from the most important ones.

## Motivation for Tracing a workflow
To optimize a script, `python myscript.py`, replace `python` with `codeflash optimize` and run the following command:

One of the hard problems with optimizing new code is verifying correctness and performance gains.
The way Codeflash verifies correctness and performance gains is by running the function under optimization against a set of test cases.
These test cases can be part of your existing unit test suite or generated by Codeflash. However, it's tedious to write test cases for every function you want to optimize, plus it's hard to come up with test cases for complex inputs and inputs that cover all edge cases.
Additionally, running the function with these test cases is not a great way to verify performance gains as the test cases might not be representative of the real-world usage of the function.
```bash
codeflash optimize myscript.py
```

To optimize code within pytest tests that you normally run like `python -m pytest tests/`, use this command:

Codeflash Tracer solves these issues.
```bash
codeflash optimize -m pytest tests/
```

## What is Codeflash Tracer?
Codeflash Tracer is a tool that traces the execution of your workflow and generates a set of test cases that are derived from how your code is actually run.
This powerful command creates high-quality optimizations, making it ideal when you need to optimize a workflow or script. The initial tracing process is slow, so try to limit your script's runtime to under 1 minute for best results. If your workflow is longer, consider breaking it into smaller sections and optimizing them separately with limited but representative data.

## What is the codeflash optimize command?

`codeflash optimize` does everything that an expert engineer would do while optimizing a workflow. It profiles your code, traces the execution of your workflow and generates a set of test cases that are derived from how your code is actually run.
Codeflash Tracer works by recording the inputs of your functions as they are called in your codebase. These inputs are then used to generate test cases that are representative of the real-world usage of your functions.
We call these generated test cases "Replay Tests" because they replay the inputs that were recorded during the tracing phase.
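
To make the recording idea concrete, here is a minimal, self-contained sketch of capturing function inputs so they can be replayed later. It is not Codeflash's implementation: the `record_inputs` decorator, the `RECORDED_CALLS` store, and the `normalize` function are hypothetical names used only for illustration. The cap of 100 recorded calls per function mirrors the documented `max_function_count` default.

```python
import copy
import functools
from collections import defaultdict

# Hypothetical in-memory store: function name -> list of recorded (args, kwargs).
RECORDED_CALLS = defaultdict(list)

# Assumed cap, mirroring the documented default of 100 traced calls per function.
MAX_CALLS_PER_FUNCTION = 100


def record_inputs(func):
    """Record a deep copy of each call's arguments, up to the cap."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls = RECORDED_CALLS[func.__qualname__]
        if len(calls) < MAX_CALLS_PER_FUNCTION:
            # Copy before calling so later mutation cannot corrupt the recording.
            calls.append((copy.deepcopy(args), copy.deepcopy(kwargs)))
        return func(*args, **kwargs)

    return wrapper


@record_inputs
def normalize(values, scale=1.0):
    """Toy workload: rescale values so they sum to `scale`."""
    total = sum(values)
    return [scale * v / total for v in values]


# Running the "workflow" records the real inputs it used.
normalize([1, 2, 3])
normalize([4, 4], scale=2.0)
```

After the two calls, `RECORDED_CALLS["normalize"]` holds both argument sets, which is the raw material a replay test would feed back into the function.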

Then, Codeflash Optimizer can use these replay tests to verify correctness and calculate accurate performance gains for the optimized functions.
Using Replay Tests, Codeflash can verify that the optimized function produces the same output as the original function and also measure the performance gains of the optimized function on the real-world inputs.
This way you can be _sure_ that the optimized function causes no changes of behavior for the traced workflow and also, that it is faster than the original function.
Using Replay Tests, Codeflash can verify that each optimized function produces the same output as the original function and also measure the performance gain of the optimized function on the real-world inputs.
This way you can be *sure* that the optimized function causes no changes of behavior for the traced workflow and also, that it is faster than the original function.
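
The check described above can be sketched in a few lines. This is an illustrative outline only, not how Codeflash implements it: `replay`, `original_sum_squares`, `optimized_sum_squares`, and `recorded_inputs` are hypothetical names, and the timing uses the standard-library `timeit` module rather than Codeflash's own benchmarking.

```python
import timeit


def original_sum_squares(n):
    """Original implementation: sum of i*i for i in range(n), via a loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def optimized_sum_squares(n):
    """Candidate optimization: closed form for the same sum."""
    return (n - 1) * n * (2 * n - 1) // 6


# Stand-in for inputs captured during tracing.
recorded_inputs = [0, 1, 10, 1000]


def replay(original, candidate, inputs, repeats=5):
    """Return (behaviour_matches, speedup) for a candidate on recorded inputs."""
    # Correctness first: every recorded input must give an identical result.
    for arg in inputs:
        if original(arg) != candidate(arg):
            return False, None

    # Then timing: take the best of several runs to reduce noise.
    t_original = min(
        timeit.repeat(lambda: [original(a) for a in inputs], number=10, repeat=repeats)
    )
    t_candidate = min(
        timeit.repeat(lambda: [candidate(a) for a in inputs], number=10, repeat=repeats)
    )
    return True, t_original / max(t_candidate, 1e-12)


behaviour_matches, speedup = replay(
    original_sum_squares, optimized_sum_squares, recorded_inputs
)
```

A candidate is only worth keeping when `behaviour_matches` is true and `speedup` is meaningfully above 1 on the recorded inputs.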

## Using Codeflash Tracer
## Using codeflash optimize

Codeflash Tracer can be used in two ways:
Codeflash Tracer can be used in three ways:

1. **As a command line module -**
1. **As an integrated command**

You can use Codeflash Tracer as a module when you run Python.
If you run a Python script as follows -
If you run a Python script as follows:

```bash
python path/to/your/file.py --your_options
```
You can trace the execution of the script by running -

You can start tracing and optimizing your code with the following command:

```bash
python -m codeflash.tracer -o codeflash.trace path/to/your/file.py --your_options
codeflash optimize path/to/your/file.py --your_options
```

So adding a `-m codeflash.tracer -o codeflash.trace` before your script will trace the execution of the script and save the trace to a file called `codeflash.trace`.
If your script itself runs as a module, you can run it as follows -
The above command should suffice in most situations. You can add an argument like `codeflash optimize -o trace_file_path.trace` if you want to customize the trace file location. Otherwise, it defaults to `codeflash.trace` in the current working directory.

2. **Trace and optimize as two separate steps**

If you want more control over the tracing and optimization process, you can trace first and optimize with the replay tests later. Each replay test is associated with a trace file.

To first create just the trace file, run

```bash
codeflash optimize -o trace_file.trace --trace-only path/to/your/file.py --your_options
```

This will create a replay test file. To optimize with the replay test, run the following command:

```bash
python -m codeflash.tracer -o codeflash.trace -m path.to.your.module --your_options
codeflash --replay-test /path/to/test_replay_test_0.py
```
More Options:
- `--max-function-count`: The maximum number of times to trace a single function. More calls to a function will not be traced. Default is 100.
More Options:
- `--tracer-timeout`: The maximum time in seconds to trace the entire workflow. Default is indefinite. This is useful while tracing really long workflows.
2. **As a Context Manager -**

You can also use Codeflash Tracer as a context manager in your codebase.
You can wrap the code you want to trace in a `with` statement as follows -
```python
from codeflash.tracer import Tracer

with Tracer(output="codeflash.trace"):
    ...  # Your code here
```
This is useful to only trace and optimize a part of your executable, not the entire script.
Sometimes, if using the tracer as a module fails, then the Context Manager can also be used to trace the code sections.

More Options:
3. **As a Context Manager -**

To trace only specific sections of your code, you can also use the Codeflash Tracer as a context manager.
You can wrap the code you want to trace in a `with` statement as follows -

```python
from codeflash.tracer import Tracer

with Tracer(output="codeflash.trace"):
    ...  # Your code here
```

Sometimes, if using the tracer as a module fails, the Context Manager can be used to trace the code sections instead. This is also much faster than tracing the whole script.

After this finishes, you can optimize using the replay tests:

```bash
codeflash --replay-test /path/to/test_replay_test_0.py
```

More Options to the Tracer:

- `disable`: If set to `True`, the tracer will not trace the code. Default is `False`.
- `max_function_count`: The maximum number of times to trace a single function. More calls to a function will not be traced. Default is 100.
- `timeout`: The maximum time in seconds to trace the entire workflow. Default is indefinite. This is useful while tracing really long workflows, to not wait indefinitely.
- `output`: The file to save the trace to. Default is `codeflash.trace`.
- `config_file_path`: The path to the `pyproject.toml` file which stores the Codeflash config. This is auto-discovered by default.
You can also disable the tracer in the code by setting the `disable=True` option in the `Tracer` constructor.

## Optimizing with Replay Tests
After the tracing phase is complete, the tracer will generate a trace file as well as a Replay Test file.
The path of the generated replay test is printed on the console after the tracing is complete. It will be located in your tests directory and have a name like `test_file_getting_traced__replay_test_0.py`.
The Replay Test file is a Python test file that when run will call the traced functions with the recorded inputs, i.e. replay them.
Now Codeflash Optimizer can use these Replay Tests to verify correctness and calculate performance gains of the optimized functions.

To optimize all the functions traced, you can run the following command -
```bash
codeflash --replay-test tests/test_file_getting_traced__replay_test_0.py
```
Codeflash will auto-discover all the functions that were traced, and use the replay tests, plus will discover existing unit tests and generate more tests to get the best optimizations.
Codeflash will open pull requests with the optimized functions as it finds them, which should speed up your end to end workflow!


You can also disable the tracer in the code by setting the `disable=True` option in the `Tracer` constructor.