@misrasaurabh1 misrasaurabh1 commented Jul 17, 2025

PR Type

Enhancement


Description

  • Add code refinement pipeline integration

  • Accumulate and refine optimization candidates

  • Multi-criteria ranking by diff & runtime

  • Refactor test env and profiling helpers


Diagram Walkthrough

flowchart LR
  A["Initial optimizations"] --> B["Accumulate candidates"]
  B --> C["Request refinements"]
  C --> D["Merge refined candidates"]
  D --> E["Rank by diff & runtime"]
  E --> F["Select best optimization"]

File Walkthrough

Relevant files
Enhancement
aiservice.py
Integrate refinement request and logging enhancements       

codeflash/api/aiservice.py

  • Added optimize_python_code_refinement method
  • Extended make_ai_service_request payload types
  • Introduced safe_get_repo_owner_and_name helper
  • Included optimized_line_profiler_results in logging
+78/-7   
code_utils.py
Add diff and ranking utility functions                                     

codeflash/code_utils/code_utils.py

  • Added diff_length for unified diff sizing
  • Introduced create_rank_dictionary_compact utility
  • Imported difflib dependency
+45/-0   
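
The two utilities named above might look roughly like the following. This is a minimal sketch, not the code from the PR: the `_sketch` function names, their signatures, and the choice to measure the diff in characters are assumptions made for illustration. Only `difflib.unified_diff` is taken from the standard library as-is.

```python
from __future__ import annotations

import difflib


def diff_length_sketch(a: str, b: str) -> int:
    """Return the character count of the unified diff between two strings.

    Hypothetical stand-in for diff_length; the real signature may differ.
    """
    a_lines = a.splitlines(keepends=True)
    b_lines = b.splitlines(keepends=True)
    # unified_diff yields nothing at all when the inputs are identical.
    return len("".join(difflib.unified_diff(a_lines, b_lines)))


def rank_dictionary_sketch(values: list[int]) -> dict[int, int]:
    """Map each original index to its rank, where rank 0 is the smallest value.

    Hypothetical stand-in for create_rank_dictionary_compact.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    return {original_index: rank for rank, original_index in enumerate(order)}


original = "def f(x):\n    return x + 1\n"
candidates = [
    "def f(x):\n    return 1 + x\n",
    "def f(x):\n    y = x\n    y += 1\n    return y\n",
]
diff_sizes = [diff_length_sketch(original, c) for c in candidates]
diff_ranks = rank_dictionary_sketch(diff_sizes)
```

Here the first candidate changes a single line, so it gets the smaller diff and the better (lower) rank.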
models.py
Define refiner request and model updates                                 

codeflash/models/models.py

  • Defined AIServiceRefinerRequest dataclass
  • Extended BestOptimization with new fields
+18/-0   
function_optimizer.py
Implement candidate refinement and ranking                             

codeflash/optimization/function_optimizer.py

  • Introduced valid_optimizations list and refine flow
  • Added refine_optimizations candidate refinement method
  • Refactored test env and profiling helpers
  • Implemented diff & runtime ranking logic
+159/-48
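
One way to combine the diff and runtime criteria is to rank candidates on each criterion separately and pick the lowest summed rank. The sketch below assumes that rule, plus a runtime tie-break. The PR description only says candidates are ranked "by diff & runtime", so the combination rule, the function names, and the nanosecond units are all hypothetical.

```python
from __future__ import annotations


def rank_positions(values: list[int]) -> dict[int, int]:
    """Map each index to its rank, where rank 0 is the smallest value."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    return {index: rank for rank, index in enumerate(order)}


def select_best_sketch(diff_lengths: list[int], runtimes_ns: list[int]) -> int:
    """Return the index of the candidate with the lowest combined rank.

    Assumed rule: sum the diff rank and the runtime rank, and break ties
    in favour of the faster candidate.
    """
    if len(diff_lengths) != len(runtimes_ns) or not diff_lengths:
        raise ValueError("expected two non-empty lists of equal length")
    diff_rank = rank_positions(diff_lengths)
    runtime_rank = rank_positions(runtimes_ns)
    combined = {i: diff_rank[i] + runtime_rank[i] for i in diff_rank}
    return min(combined, key=lambda i: (combined[i], runtimes_ns[i]))


# The first candidate has both the smallest diff and the fastest runtime.
best_clear = select_best_sketch([10, 50], [100, 200])
# All three candidates tie on summed rank, so the fastest one is chosen.
best_tied = select_best_sketch([120, 40, 300], [900, 1000, 850])
```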
Bug fix
critic.py
Support None baseline in speedup_critic                                   

codeflash/result/critic.py

  • Updated speedup_critic to accept None baseline
  • Fallback to perf_gain when no best runtime
+4/-1     
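
The `None` baseline handling described for `speedup_critic` can be illustrated with a simplified stand-in. The function name, parameter names, and the 5% minimum-gain threshold below are assumptions for the sketch, and the real critic may apply additional checks.

```python
from __future__ import annotations


def speedup_critic_sketch(
    original_runtime_ns: int,
    optimized_runtime_ns: int,
    best_runtime_until_now_ns: int | None = None,
    min_gain: float = 0.05,  # hypothetical noise threshold
) -> bool:
    """Decide whether an optimized runtime counts as an improvement."""
    if optimized_runtime_ns <= 0:
        return False
    perf_gain = (original_runtime_ns - optimized_runtime_ns) / optimized_runtime_ns
    if best_runtime_until_now_ns is None:
        # No previous best to beat: fall back to the gain check alone.
        return perf_gain > min_gain
    return perf_gain > min_gain and optimized_runtime_ns < best_runtime_until_now_ns


no_baseline = speedup_critic_sketch(1000, 800)
beats_best = speedup_critic_sketch(1000, 800, best_runtime_until_now_ns=900)
slower_than_best = speedup_critic_sketch(1000, 800, best_runtime_until_now_ns=700)
```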

github-actions bot commented Jul 17, 2025

PR Reviewer Guide 🔍

(Review updated until commit c596e12)

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

RequestSchemaMismatch

The AIServiceRefinerRequest dataclass lacks fields experiment_metadata and fto_name used in refine_optimizations, causing unexpected field errors during instantiation.

@dataclass(frozen=True)
class AIServiceRefinerRequest:
    optimization_id: str
    original_source_code: str
    read_only_dependency_code: str
    original_code_runtime: str
    optimized_source_code: str
    optimized_explanation: str
    optimized_code_runtime: str
    speedup: str
    trace_id: str
    original_line_profiler_results: str
    optimized_line_profiler_results: str
BestRuntimeLogic

The _best_runtime_until_now variable is initialized but never passed to speedup_critic; as a result, the critic always receives None and prior best runtimes aren't considered in selection.

) -> BestOptimization | None:
    best_optimization: BestOptimization | None = None
    _best_runtime_until_now = original_code_baseline.runtime

    speedup_ratios: dict[str, float | None] = {}
    optimized_runtimes: dict[str, float | None] = {}
    is_correct = {}
    optimized_line_profiler_results: dict[str, str] = {}
IncorrectImport

AIServiceRefinerRequest is imported from codeflash.api.aiservice, but it's defined in codeflash/models/models.py, which may lead to import errors or circular dependencies.

from codeflash.api.aiservice import AiServiceClient, AIServiceRefinerRequest, LocalAiServiceClient

github-actions bot commented Jul 17, 2025

PR Code Suggestions ✨

Latest suggestions up to c596e12
Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Define missing dataclass fields

The AIServiceRefinerRequest dataclass is constructed with experiment_metadata and
fto_name in refine_optimizations but those fields aren’t defined. Add matching
fields to the dataclass so the init call doesn’t fail.

codeflash/models/models.py [31-43]

 @dataclass(frozen=True)
 class AIServiceRefinerRequest:
     optimization_id: str
     original_source_code: str
     read_only_dependency_code: str
     original_code_runtime: str
     optimized_source_code: str
     optimized_explanation: str
     optimized_code_runtime: str
     speedup: str
     trace_id: str
     original_line_profiler_results: str
     optimized_line_profiler_results: str
+    experiment_metadata: ExperimentMetadata | None
+    fto_name: str
Suggestion importance[1-10]: 9

__

Why: The AIServiceRefinerRequest dataclass is missing experiment_metadata and fto_name fields, which are passed in refine_optimizations, causing a constructor error at runtime.

High
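
The failure mode described here can be reproduced with a reduced dataclass. The classes below are illustrative stand-ins with only a few fields, not the real `AIServiceRefinerRequest`, and the `dict` type used for `experiment_metadata` is a placeholder for `ExperimentMetadata`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RefinerRequestBefore:
    # Reduced stand-in: the real dataclass has many more fields.
    optimization_id: str
    trace_id: str


@dataclass(frozen=True)
class RefinerRequestAfter:
    optimization_id: str
    trace_id: str
    fto_name: str
    experiment_metadata: Optional[dict] = None  # placeholder type


kwargs = {"optimization_id": "opt-1", "trace_id": "t-1", "fto_name": "my_function"}

try:
    RefinerRequestBefore(**kwargs)
    error_message = ""
except TypeError as exc:
    # The generated __init__ rejects keyword arguments that are not fields.
    error_message = str(exc)

fixed = RefinerRequestAfter(**kwargs)
```

Giving `experiment_metadata` a default in this sketch is a design choice made for the example. The suggested diff above declares it without a default.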

Previous suggestions

Suggestions up to commit 65d2971
Category | Suggestion | Impact
Possible issue
Fix payload field name

The payload uses a non-existent field original_read_only_dependency_code; it should
match the dataclass field read_only_dependency_code. Rename both the key and
attribute reference to keep them consistent.

codeflash/api/aiservice.py [241-242]

 {
     "original_source_code": opt.original_source_code,
-    "original_read_only_dependency_code": opt.original_read_only_dependency_code,
+    "read_only_dependency_code": opt.read_only_dependency_code,
     ...
 }
Suggestion importance[1-10]: 9

__

Why: The payload key "original_read_only_dependency_code" and attribute opt.original_read_only_dependency_code do not exist on AIServiceRefinerRequest, so renaming to read_only_dependency_code is essential to avoid runtime errors.

High
General
Move refinement block

This block is inside the while candidates: loop and will never run because the loop
exits when len(candidates)==0. Move it outside the loop so refinement runs once
after processing all candidates.

codeflash/optimization/function_optimizer.py [543-571]

-if len(candidates) == 0 and len(self.valid_optimizations) > 0 and not refinement_done:
+# after the while-candidates loop ends
+if len(self.valid_optimizations) > 0 and not refinement_done:
     # ... refine_optimizations ...
     candidates.extend(more_opt_candidates)
     refinement_done = True
Suggestion importance[1-10]: 9

__

Why: The if len(candidates) == 0 check is inside the while candidates: loop and thus never executes, so relocating it after the loop is critical for the refinement step to run.

High
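
A possible shape for the suggested control flow is sketched below: drain the queue, refine once, then drain the refined candidates. All names are hypothetical. String candidates and the `is_valid` and `refine` callables stand in for the optimizer's real candidate objects and service calls.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List


def drain(queue: Deque[str], is_valid: Callable[[str], bool], valid: List[str]) -> None:
    """Process every queued candidate, keeping the ones that pass validation."""
    while queue:
        candidate = queue.popleft()
        if is_valid(candidate):
            valid.append(candidate)


def optimize_with_refinement(
    initial: Iterable[str],
    is_valid: Callable[[str], bool],
    refine: Callable[[List[str]], List[str]],
) -> List[str]:
    candidates: Deque[str] = deque(initial)
    valid: List[str] = []
    drain(candidates, is_valid, valid)
    # Refinement runs once, after the first pass has emptied the queue.
    if valid:
        candidates.extend(refine(list(valid)))
        drain(candidates, is_valid, valid)
    return valid


refine_calls: List[List[str]] = []


def fake_refine(valid: List[str]) -> List[str]:
    refine_calls.append(list(valid))
    return [candidate + "_refined" for candidate in valid]


result = optimize_with_refinement(["a", "bad", "b"], lambda c: c != "bad", fake_refine)
```

Because the refinement call sits between two drain passes, no `refinement_done` flag is needed in this version.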
Serialize metadata field

Sending a Pydantic dataclass instance directly may fail JSON serialization. Convert
the experiment_metadata to a dict (or JSON) before adding it to the payload.

codeflash/api/aiservice.py [252]

-"experiment_metadata": opt.experiment_metadata,
+"experiment_metadata": opt.experiment_metadata.dict() if opt.experiment_metadata else None,
Suggestion importance[1-10]: 6

__

Why: Passing a Pydantic dataclass directly may not serialize with requests.json, so converting opt.experiment_metadata to a dict ensures the payload is JSON-serializable.

Low
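
The serialization concern can be demonstrated with a standard-library dataclass standing in for `ExperimentMetadata`. The field names below are invented, and the right conversion call for the real class depends on how it is defined, so `dataclasses.asdict` is an assumption that holds only for this stand-in.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class ExperimentMetadataSketch:
    # Hypothetical fields, for illustration only.
    id: Optional[str]
    group: str


def build_payload(metadata: Optional[ExperimentMetadataSketch]) -> dict:
    """Convert the metadata to plain types before it goes into the payload."""
    return {"experiment_metadata": asdict(metadata) if metadata is not None else None}


metadata = ExperimentMetadataSketch(id="exp-1", group="control")

try:
    json.dumps({"experiment_metadata": metadata})
    raw_serializable = True
except TypeError:
    # json does not know how to encode an arbitrary dataclass instance.
    raw_serializable = False

encoded = json.dumps(build_payload(metadata))
```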

codeflash-ai bot added a commit that referenced this pull request Jul 17, 2025
Here’s an optimized version that preserves all existing function signatures, logic, and return values but reduces unnecessary overhead, short-circuits early, and eliminates redundant object lookups and function calls.

**Key Optimizations:**
- Use local variable binding early in `get_pr_number` to avoid repeated imports/GL lookups for `get_cached_gh_event_data`.
- Import `get_cached_gh_event_data` once at the top of the module; importing it locally inside the function is much slower.
- Use early returns in `speedup_critic` after fast checks to avoid unnecessary branches and function calls.
- Remove unneeded bool() wrappers where the result is already bool.
- Use direct access to already-imported functions instead of accessing via module (inlining `env_utils.get_pr_number`).



**Summary**:  
All function return values and signatures are preserved. Redundant lookups are eliminated, external calls are reduced, and fast-path branches short-circuit unnecessary logic to reduce overall runtime and memory allocations. Comments are preserved unless the associated code was optimized.

codeflash-ai bot commented Jul 17, 2025

⚡️ Codeflash found optimizations for this PR

📄 15% (0.15x) speedup for speedup_critic in codeflash/result/critic.py

⏱️ Runtime: 1.84 milliseconds → 1.60 milliseconds (best of 56 runs)

I created a new dependent PR with the suggested changes. Please review:

If you approve, it will be merged into this PR (branch refinement).

@aseembits93 aseembits93 requested a review from KRRT7 July 28, 2025 17:21
aseembits93 previously approved these changes Jul 28, 2025
@aseembits93 aseembits93 left a comment

LGTM

@aseembits93 aseembits93 marked this pull request as ready for review July 28, 2025 17:21
@github-actions
Persistent review updated to latest commit c596e12

@aseembits93 aseembits93 enabled auto-merge July 28, 2025 21:46
@aseembits93 aseembits93 merged commit b478f10 into main Jul 28, 2025
17 checks passed
@aseembits93 aseembits93 deleted the refinement branch August 17, 2025 19:26