
Conversation

@dasarchan
Contributor

Adds a call to cfapi that pushes hashes of the function's code context, in order to check whether the function has already been optimized.

One thing to note here is that this checking logic happens early, during function discovery.
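The idea can be sketched as follows (the function name and hashing scheme here are illustrative assumptions, not the actual cfapi contract):

```python
import hashlib


def code_context_hash(code_context: str) -> str:
    """Hash a function's code context so the backend can spot repeats."""
    return hashlib.sha256(code_context.encode("utf-8")).hexdigest()


# During discovery, each candidate function's hash would be collected and
# sent to the backend in one request; the response says which hashes were
# already optimized, and those functions are skipped.
contexts = {"my_func": "def my_func(x):\n    return x + 1\n"}
hashes = {name: code_context_hash(src) for name, src in contexts.items()}
```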

@dasarchan dasarchan requested a review from misrasaurabh1 June 3, 2025 18:03
@dasarchan dasarchan self-assigned this Jun 3, 2025


def check_optimization_status(
functions_by_file: dict[Path, list[FunctionToOptimize]],
Contributor

Q: What is the scenario where the function is already code that has been optimized by CF?

Contributor Author

Right now it would actually try to reoptimize it - @misrasaurabh1, what's the desired behavior here?

codeflash-ai bot added a commit that referenced this pull request Jun 5, 2025
…275 (`dont-optimize-repeatedly-gh-actions`)

Here is the optimized version of your program, focusing on speeding up the slow path in `make_cfapi_request`, which is dominated by `json.dumps(payload, indent=None, default=pydantic_encoder)` and the use of `requests.post(..., data=json_payload, ...)`. 

Key optimizations:

- **Use `requests.post(..., json=payload, ...)`:** This lets `requests` do the JSON serialization more efficiently (internally uses `json.dumps`). Furthermore, `requests` will add the `Content-Type: application/json` header if you use the `json` argument.
- **Only use the custom encoder if really needed:** Only pass `default=pydantic_encoder` if the payload contains objects requiring it. If not, the standard encoder is much faster. You can try a direct serialization and fall back if a `TypeError` is raised.
- **Avoid repeated `.upper()`** inside the POST/GET dispatch by normalizing early.
- **Avoid unnecessary string interpolation.**
- **Avoid updating headers dict when not needed.**
- **Other micro-optimizations:** Use local variables, merge dicts once, etc.
All comments are preserved; they were modified or added only where the code changed.



**Explanation of biggest win:**  
The largest bottleneck was in JSON encoding and in manually setting the content-type header. Now, `requests.post(..., json=payload)` is used for the fastest path in the vast majority of requests, only falling back to a slower path if necessary. This should substantially speed up both serialization and POST.

This approach is backward-compatible and will produce exactly the same results as before.
@codeflash-ai
Contributor

codeflash-ai bot commented Jun 5, 2025

⚡️ Codeflash found optimizations for this PR

📄 44% (0.44x) speedup for is_function_being_optimized_again in codeflash/api/cfapi.py

⏱️ Runtime: 2.79 milliseconds → 1.94 milliseconds (best of 74 runs)

I created a new dependent PR with the suggested changes. Please review:

If you approve, it will be merged into this PR (branch dont-optimize-repeatedly-gh-actions).

codeflash-ai bot added a commit that referenced this pull request Jun 5, 2025
…in PR #275 (`dont-optimize-repeatedly-gh-actions`)

Here is an optimized version of your code, targeting the areas highlighted as slowest in your line profiling.

### Key Optimizations

1. **Read Only Necessary Lines:**
   - When `starting_line` and `ending_line` are provided, instead of reading the entire file and calling `.splitlines()`, read only the lines needed. This drastically lowers memory use and speeds up file operations for large files.
   - Uses `itertools.islice` to efficiently pluck only relevant lines.

2. **String Manipulation Reduction:**
   - Reduce the number of intermediate string allocations by reusing objects as much as possible and joining lines only once.
   - Avoid calling `strip()` unless absolutely necessary (likely only for code content).

3. **Variable Lookup:**
   - Minimize attribute lookups that are inside loops.

The function semantics are preserved exactly. All comments are retained, and improved where the code was changed, for better understanding.



### Rationale

- The main bottleneck is reading full files and splitting them when only a small region is needed. By slicing only the relevant lines from the file, the function becomes much faster for large files or high call counts.
- All behaviors, including fallback and hash calculation, are unchanged.
- Import of `islice` is local and lightweight.

**This should significantly improve both runtime and memory usage of `get_code_context_hash`.**
@codeflash-ai
Contributor

codeflash-ai bot commented Jun 5, 2025

⚡️ Codeflash found optimizations for this PR

📄 15% (0.15x) speedup for FunctionToOptimize.get_code_context_hash in codeflash/discovery/functions_to_optimize.py

⏱️ Runtime: 3.67 milliseconds → 3.20 milliseconds (best of 72 runs)

I created a new dependent PR with the suggested changes. Please review:

If you approve, it will be merged into this PR (branch dont-optimize-repeatedly-gh-actions).

@openhands-ai

openhands-ai bot commented Jun 5, 2025

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Lint
    • Mypy Type Checking for CLI
    • end-to-end-test
    • end-to-end-test

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #275

Feel free to include any additional details that might help me get this PR into a better state.


@misrasaurabh1 misrasaurabh1 requested a review from a team June 8, 2025 08:36
  )
  for test_index, (test_path, test_perf_path) in enumerate(
-     zip(generated_test_paths, generated_perf_test_paths)
+     zip(generated_test_paths, generated_perf_test_paths, strict=False)
Contributor

Revert this for 3.9: the `strict` keyword argument to `zip()` was only added in Python 3.10.
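A minimal illustration of the compatibility concern (the `strict` keyword to `zip()` is a Python 3.10 addition from PEP 618; passing it on 3.9 fails at call time):

```python
import sys

# Works on every supported Python version.
pairs = list(zip([1, 2], ["a", "b"]))

if sys.version_info >= (3, 10):
    # strict=True makes zip raise ValueError on a length mismatch; on 3.9,
    # passing strict= at all raises TypeError, hence the revert request.
    pairs_checked = list(zip([1, 2], ["a", "b"], strict=True))
```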

@misrasaurabh1 misrasaurabh1 enabled auto-merge June 9, 2025 06:18
@misrasaurabh1 misrasaurabh1 merged commit a2e78e1 into main Jun 9, 2025
16 checks passed
