Don't repeatedly optimize gh actions #275
Conversation
```python
def check_optimization_status(
    functions_by_file: dict[Path, list[FunctionToOptimize]],
```
Q: What happens in the scenario where the function is code that CF has already optimized?
Right now it would actually try to re-optimize it. @misrasaurabh1, what's the desired behavior here?
…com/codeflash-ai/codeflash into dont-optimize-repeatedly-gh-actions
⚡️ Codeflash found optimizations for this PR 📄 44% (0.44x) speedup for `make_cfapi_request` in PR #275 (`dont-optimize-repeatedly-gh-actions`)

Here is the optimized version of your program, focusing on speeding up the slow path in `make_cfapi_request`, which is dominated by `json.dumps(payload, indent=None, default=pydantic_encoder)` and the use of `requests.post(..., data=json_payload, ...)`.

Key optimizations:

- **Use `requests.post(..., json=payload, ...)`:** This lets `requests` do the JSON serialization more efficiently (it uses `json.dumps` internally). Furthermore, `requests` adds the `Content-Type: application/json` header for you when you use the `json` argument.
- **Only use the custom encoder if really needed:** Only pass `default=pydantic_encoder` if the payload contains objects that require it. If not, the standard encoder is much faster. You can try a direct serialization and fall back if a `TypeError` is raised.
- **Avoid repeated `.upper()`** inside the POST/GET dispatch by normalizing early.
- **Avoid unnecessary string interpolation.**
- **Avoid updating the headers dict when not needed.**
- **Other micro-optimizations:** Use local variables, merge dicts once, etc., with all comments preserved and only modified or added where the code changed.

**Explanation of the biggest win:** The largest bottleneck was the JSON encoding and the manual setting of the content-type header. Now `requests.post(..., json=payload)` is used as the fast path for the vast majority of requests, falling back to a slower path only when necessary. This should substantially speed up both serialization and the POST itself. The approach is backward-compatible and produces exactly the same results as before.
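A minimal sketch of the fast-path/fallback serialization described above, assuming a pydantic v1 style `pydantic_encoder` import and a hypothetical `CFAPI_BASE_URL`; the actual signature and base URL in the repo may differ:

```python
import json
from typing import Optional

import requests
from pydantic.json import pydantic_encoder  # pydantic v1 import path, assumed

CFAPI_BASE_URL = "https://app.codeflash.ai/cfapi"  # hypothetical base URL


def make_cfapi_request(endpoint: str, method: str, payload: Optional[dict] = None) -> requests.Response:
    url = f"{CFAPI_BASE_URL}{endpoint}"
    method = method.upper()  # normalize once, not per branch
    if method == "POST":
        try:
            # Fast path: requests serializes with the stdlib encoder and sets
            # the Content-Type: application/json header itself.
            return requests.post(url, json=payload, timeout=60)
        except TypeError:
            # Slow path: the payload holds objects (e.g. pydantic models) the
            # stdlib encoder can't handle, so fall back to the custom encoder.
            json_payload = json.dumps(payload, indent=None, default=pydantic_encoder)
            return requests.post(
                url, data=json_payload, headers={"Content-Type": "application/json"}, timeout=60
            )
    return requests.get(url, params=payload, timeout=60)
```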
    
⚡️ Codeflash found optimizations for this PR 📄 15% (0.15x) speedup for `get_code_context_hash` in PR #275 (`dont-optimize-repeatedly-gh-actions`)

Here is an optimized version of your code, targeting the areas highlighted as slowest in your line profiling.

### Key Optimizations

1. **Read only necessary lines:**
   - When `starting_line` and `ending_line` are provided, read only the lines needed instead of reading the entire file and calling `.splitlines()`. This drastically lowers memory use and speeds up file operations for large files.
   - Uses `itertools.islice` to efficiently pluck only the relevant lines.
2. **Reduced string manipulation:**
   - Reduce the number of intermediate string allocations by reusing objects as much as possible and joining lines only once.
   - Avoids `strip()` unless absolutely necessary (likely only for the code content).
3. **Cheaper variable lookup:**
   - Minimize attribute lookups inside loops.

The function semantics are preserved exactly. All comments are retained, and improved where the code changed, for better understanding.

### Rationale

- The main bottleneck is reading full files and splitting them when only a small region is needed. By slicing only the relevant lines from the file, the function becomes much faster for large files or high call counts.
- All behaviors, including the fallback and the hash calculation, are unchanged.
- The import of `islice` is local and lightweight.

**This should significantly improve both the runtime and memory usage of `get_code_context_hash`.**
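A minimal sketch of the `islice`-based line slicing described above; the exact signature and the choice of SHA-256 are assumptions, not necessarily what Codeflash generated:

```python
import hashlib
from itertools import islice
from pathlib import Path
from typing import Optional


def get_code_context_hash(
    file_path: Path, starting_line: Optional[int] = None, ending_line: Optional[int] = None
) -> str:
    if starting_line is not None and ending_line is not None:
        with file_path.open(encoding="utf-8") as f:
            # Pluck only lines [starting_line, ending_line] (1-indexed) without
            # materializing the whole file in memory.
            code = "".join(islice(f, starting_line - 1, ending_line))
    else:
        # Fallback: hash the whole file when no line range is given.
        code = file_path.read_text(encoding="utf-8")
    return hashlib.sha256(code.strip().encode("utf-8")).hexdigest()
```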
    
Looks like there are a few issues preventing this PR from being merged! If you'd like me to help, just leave a comment. Feel free to include any additional details that might help me get this PR into a better state.
    
```diff
 )
 for test_index, (test_path, test_perf_path) in enumerate(
-    zip(generated_test_paths, generated_perf_test_paths)
+    zip(generated_test_paths, generated_perf_test_paths, strict=False)
```
Revert this for 3.9; the `strict=` argument to `zip()` is only available on Python 3.10+.
Adds a call to cfapi to push hashes of each function's code context, in order to check whether the function is being optimized again.
One thing to note here is that this checking logic happens early, during function discovery.
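Given that description and the `check_optimization_status` signature shown above, here is a minimal sketch of what the early check could look like, reusing the `get_code_context_hash` and `make_cfapi_request` sketches from earlier in this thread; the endpoint name, response shape, import path, and `FunctionToOptimize` attributes are assumptions:

```python
from pathlib import Path

from codeflash.discovery.functions_to_optimize import FunctionToOptimize  # assumed import path


def check_optimization_status(
    functions_by_file: dict[Path, list[FunctionToOptimize]],
) -> dict[Path, list[FunctionToOptimize]]:
    """Filter out functions whose code-context hash cfapi reports as already optimized."""
    hash_to_function: dict[str, tuple[Path, FunctionToOptimize]] = {}
    for path, functions in functions_by_file.items():
        for fn in functions:
            # starting_line / ending_line attributes are assumed here.
            code_hash = get_code_context_hash(path, fn.starting_line, fn.ending_line)
            hash_to_function[code_hash] = (path, fn)

    # Hypothetical endpoint: returns the subset of hashes already optimized.
    response = make_cfapi_request(
        "/is-already-optimized", "POST", {"code_hashes": list(hash_to_function)}
    )
    already_optimized = set(response.json().get("already_optimized", []))

    remaining: dict[Path, list[FunctionToOptimize]] = {}
    for code_hash, (path, fn) in hash_to_function.items():
        if code_hash not in already_optimized:
            remaining.setdefault(path, []).append(fn)
    return remaining
```

Because this runs during discovery, already-optimized functions would be dropped before any test generation or optimization work is spent on them.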