Conversation

@aseembits93 (Contributor) commented Oct 22, 2025

PR Type

Enhancement


Description

  • Increase AI request timeouts to 120s

  • Rename payload key to optimizationReview

  • Gate PR creation on non-low review

  • Initialize optimization review response


Diagram Walkthrough

flowchart LR
  timeout60["AI timeouts: 60s"] -- "increase" --> timeout120["AI timeouts: 120s"]
  impactKey["payload key 'optimizationImpact'"] -- "rename" --> reviewKey["'optimizationReview'"]
  reviewResp["opt_review_response init"] -- "set empty" --> initState["initialized state"]
  prFlow["PR creation"] -- "only if review != 'low'" --> gatedPR["gated PR creation"]

File Walkthrough

Relevant files
Enhancement
aiservice.py
Extend AI service request timeouts to 120s                             

codeflash/api/aiservice.py

  • Increase refinement timeout from 60s to 120s
  • Increase optimization review timeout from 60s to 120s
+2/-2     
cfapi.py
Standardize payload key to optimizationReview                       

codeflash/api/cfapi.py

  • Rename payload key to optimizationReview
  • Apply rename across suggest, create_pr, create_staging
  • Update legacy comment to reflect rename
+3/-3     
function_optimizer.py
Gate PR creation on optimization review outcome                   

codeflash/optimization/function_optimizer.py

  • Initialize opt_review_response to empty string
  • Gate PR creation when review is not "low"
  • Preserve staging flow and error handling
+2/-3     

@github-actions

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Backward Compatibility

The payload key was renamed to optimizationReview. Validate that all downstream consumers (backend endpoints and any JS/TS clients) accept this new key, since optimizationImpact is fully removed here.

        "diffContents": file_changes,
        "prCommentFields": pr_comment.to_json(),
        "existingTests": existing_tests,
        "generatedTests": generated_tests,
        "traceId": trace_id,
        "coverage_message": coverage_message,
        "replayTests": replay_tests,
        "concolicTests": concolic_tests,
        "optimizationReview": optimization_review,  # impact keyword left for legacy reasons, touches js/ts code
    }
    return make_cfapi_request(endpoint="/suggest-pr-changes", method="POST", payload=payload)


def create_pr(
    owner: str,
    repo: str,
    base_branch: str,
    file_changes: dict[str, FileDiffContent],
    pr_comment: PrComment,
    existing_tests: str,
    generated_tests: str,
    trace_id: str,
    coverage_message: str,
    replay_tests: str = "",
    concolic_tests: str = "",
    optimization_review: str = "",
) -> Response:
    """Create a pull request, targeting the specified branch. (usually 'main').

    :param owner: The owner of the repository.
    :param repo: The name of the repository.
    :param base_branch: The base branch to target.
    :param file_changes: A dictionary of file changes.
    :param pr_comment: The pull request comment object, containing the optimization explanation, best runtime, etc.
    :param generated_tests: The generated tests.
    :return: The response object.
    """
    # convert Path objects to strings
    payload = {
        "owner": owner,
        "repo": repo,
        "baseBranch": base_branch,
        "diffContents": file_changes,
        "prCommentFields": pr_comment.to_json(),
        "existingTests": existing_tests,
        "generatedTests": generated_tests,
        "traceId": trace_id,
        "coverage_message": coverage_message,
        "replayTests": replay_tests,
        "concolicTests": concolic_tests,
        "optimizationReview": optimization_review,  # Impact keyword left for legacy reasons, it touches js/ts codebase
    }
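
If a transition period is needed here, a minimal sketch of how a consumer could accept either key during the rollout; the reader function below is hypothetical and not part of this repository, it only assumes the payload arrives as a plain dict:

def read_optimization_review(payload: dict) -> str:
    # Hypothetical transitional reader on the consumer side: prefer the new
    # "optimizationReview" key and fall back to the legacy "optimizationImpact"
    # key until every client sends the renamed field.
    value = payload.get("optimizationReview")
    if value is None:
        value = payload.get("optimizationImpact", "")
    return str(value)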
Logic Change

PR creation is now gated on opt_review_response != "low". Confirm the exact shape/value of responses from get_optimization_review (e.g., case sensitivity, whitespace, empty string on failure) to avoid unintentionally skipping valid PRs or creating PRs when the review failed.

    data["root_dir"] = git_root_dir()
    calling_fn_details = get_opt_review_metrics(
        self.function_to_optimize_source_code,
        self.function_to_optimize.file_path,
        self.function_to_optimize.qualified_name,
        self.project_root,
        self.test_cfg.tests_root,
    )
    try:
        opt_review_response = self.aiservice_client.get_optimization_review(
            **data, calling_fn_details=calling_fn_details
        )
    except Exception as e:
        logger.debug(f"optimization review response failed, investigate {e}")
    data["optimization_review"] = opt_review_response
if raise_pr and not staging_review and opt_review_response != "low":
    data["git_remote"] = self.args.git_remote
    check_create_pr(**data)
elif staging_review:
    response = create_staging(**data)
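
Worth noting when verifying this gate: opt_review_response starts as an empty string and stays empty if get_optimization_review raises, and "" != "low" evaluates to True, so a failed review still allows PR creation. A minimal sketch of a stricter, normalized gate, assuming the service returns 'high', 'medium' or 'low' on success (an illustration only, not the code in this PR):

# Illustrative alternative gate (not the PR's code): normalize the value and
# treat an empty/failed review the same as "low", i.e. skip the PR.
review = (opt_review_response or "").strip().lower()
if raise_pr and not staging_review and review in ("high", "medium"):
    data["git_remote"] = self.args.git_remote
    check_create_pr(**data)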
Timeout Increase

Timeouts increased to 120s; ensure calling contexts handle longer waits without blocking UX and that retries/cancellation behavior remains appropriate.

        for opt in request
    ]
    logger.debug(f"Refining {len(request)} optimizations…")
    console.rule()
    try:
        response = self.make_ai_service_request("/refinement", payload=payload, timeout=120)
    except requests.exceptions.RequestException as e:
        logger.exception(f"Error generating optimization refinements: {e}")
        ph("cli-optimize-error-caught", {"error": str(e)})
        return []

    if response.status_code == 200:
        refined_optimizations = response.json()["refinements"]
        logger.debug(f"Generated {len(refined_optimizations)} candidate refinements.")
        console.rule()

        refinements = self._get_valid_candidates(refined_optimizations)
        return [
            OptimizedCandidate(
                source_code=c.source_code,
                explanation=c.explanation,
                optimization_id=c.optimization_id[:-4] + "refi",
            )
            for c in refinements
        ]

    try:
        error = response.json()["error"]
    except Exception:
        error = response.text
    logger.error(f"Error generating optimized candidates: {response.status_code} - {error}")
    ph("cli-optimize-error-response", {"response_status_code": response.status_code, "error": error})
    console.rule()
    return []

def get_new_explanation(  # noqa: D417
    self,
    source_code: str,
    optimized_code: str,
    dependency_code: str,
    trace_id: str,
    original_line_profiler_results: str,
    optimized_line_profiler_results: str,
    original_code_runtime: str,
    optimized_code_runtime: str,
    speedup: str,
    annotated_tests: str,
    optimization_id: str,
    original_explanation: str,
    original_throughput: str | None = None,
    optimized_throughput: str | None = None,
    throughput_improvement: str | None = None,
) -> str:
    """Optimize the given python code for performance by making a request to the Django endpoint.

    Parameters
    ----------
    - source_code (str): The python code to optimize.
    - optimized_code (str): The python code generated by the AI service.
    - dependency_code (str): The dependency code used as read-only context for the optimization
    - original_line_profiler_results: str - line profiler results for the baseline code
    - optimized_line_profiler_results: str - line profiler results for the optimized code
    - original_code_runtime: str - runtime for the baseline code
    - optimized_code_runtime: str - runtime for the optimized code
    - speedup: str - speedup of the optimized code
    - annotated_tests: str - test functions annotated with runtime
    - optimization_id: str - unique id of opt candidate
    - original_explanation: str - original_explanation generated for the opt candidate
    - original_throughput: str | None - throughput for the baseline code (operations per second)
    - optimized_throughput: str | None - throughput for the optimized code (operations per second)
    - throughput_improvement: str | None - throughput improvement percentage

    Returns
    -------
    - List[OptimizationCandidate]: A list of Optimization Candidates.

    """
    payload = {
        "trace_id": trace_id,
        "source_code": source_code,
        "optimized_code": optimized_code,
        "original_line_profiler_results": original_line_profiler_results,
        "optimized_line_profiler_results": optimized_line_profiler_results,
        "original_code_runtime": original_code_runtime,
        "optimized_code_runtime": optimized_code_runtime,
        "speedup": speedup,
        "annotated_tests": annotated_tests,
        "optimization_id": optimization_id,
        "original_explanation": original_explanation,
        "dependency_code": dependency_code,
        "original_throughput": original_throughput,
        "optimized_throughput": optimized_throughput,
        "throughput_improvement": throughput_improvement,
    }
    logger.info("loading|Generating explanation")
    console.rule()
    try:
        response = self.make_ai_service_request("/explain", payload=payload, timeout=60)
    except requests.exceptions.RequestException as e:
        logger.exception(f"Error generating explanations: {e}")
        ph("cli-optimize-error-caught", {"error": str(e)})
        return ""

    if response.status_code == 200:
        explanation: str = response.json()["explanation"]
        console.rule()
        return explanation
    try:
        error = response.json()["error"]
    except Exception:
        error = response.text
    logger.error(f"Error generating optimized candidates: {response.status_code} - {error}")
    ph("cli-optimize-error-response", {"response_status_code": response.status_code, "error": error})
    console.rule()
    return ""

def generate_ranking(  # noqa: D417
    self, trace_id: str, diffs: list[str], optimization_ids: list[str], speedups: list[float]
) -> list[int] | None:
    """Optimize the given python code for performance by making a request to the Django endpoint.

    Parameters
    ----------
    - trace_id : unique uuid of function
    - diffs : list of unified diff strings of opt candidates
    - speedups : list of speedups of opt candidates

    Returns
    -------
    - List[int]: Ranking of opt candidates in decreasing order

    """
    payload = {
        "trace_id": trace_id,
        "diffs": diffs,
        "speedups": speedups,
        "optimization_ids": optimization_ids,
        "python_version": platform.python_version(),
    }
    logger.info("loading|Generating ranking")
    console.rule()
    try:
        response = self.make_ai_service_request("/rank", payload=payload, timeout=60)
    except requests.exceptions.RequestException as e:
        logger.exception(f"Error generating ranking: {e}")
        ph("cli-optimize-error-caught", {"error": str(e)})
        return None

    if response.status_code == 200:
        ranking: list[int] = response.json()["ranking"]
        console.rule()
        return ranking
    try:
        error = response.json()["error"]
    except Exception:
        error = response.text
    logger.error(f"Error generating ranking: {response.status_code} - {error}")
    ph("cli-optimize-error-response", {"response_status_code": response.status_code, "error": error})
    console.rule()
    return None

def log_results(  # noqa: D417
    self,
    function_trace_id: str,
    speedup_ratio: dict[str, float | None] | None,
    original_runtime: float | None,
    optimized_runtime: dict[str, float | None] | None,
    is_correct: dict[str, bool] | None,
    optimized_line_profiler_results: dict[str, str] | None,
    metadata: dict[str, Any] | None,
    optimizations_post: dict[str, str] | None = None,
) -> None:
    """Log features to the database.

    Parameters
    ----------
    - function_trace_id (str): The UUID.
    - speedup_ratio (Optional[Dict[str, float]]): The speedup.
    - original_runtime (Optional[Dict[str, float]]): The original runtime.
    - optimized_runtime (Optional[Dict[str, float]]): The optimized runtime.
    - is_correct (Optional[Dict[str, bool]]): Whether the optimized code is correct.
    - optimized_line_profiler_results: line_profiler results for every candidate mapped to their optimization_id
    - metadata: contains the best optimization id
    - optimizations_post - dict mapping opt id to code str after postprocessing

    """
    payload = {
        "trace_id": function_trace_id,
        "speedup_ratio": speedup_ratio,
        "original_runtime": original_runtime,
        "optimized_runtime": optimized_runtime,
        "is_correct": is_correct,
        "codeflash_version": codeflash_version,
        "optimized_line_profiler_results": optimized_line_profiler_results,
        "metadata": metadata,
        "optimizations_post": optimizations_post,
    }
    try:
        self.make_ai_service_request("/log_features", payload=payload, timeout=5)
    except requests.exceptions.RequestException as e:
        logger.exception(f"Error logging features: {e}")

def generate_regression_tests(  # noqa: D417
    self,
    source_code_being_tested: str,
    function_to_optimize: FunctionToOptimize,
    helper_function_names: list[str],
    module_path: Path,
    test_module_path: Path,
    test_framework: str,
    test_timeout: int,
    trace_id: str,
    test_index: int,
) -> tuple[str, str, str] | None:
    """Generate regression tests for the given function by making a request to the Django endpoint.

    Parameters
    ----------
    - source_code_being_tested (str): The source code of the function being tested.
    - function_to_optimize (FunctionToOptimize): The function to optimize.
    - helper_function_names (list[Source]): List of helper function names.
    - module_path (Path): The module path where the function is located.
    - test_module_path (Path): The module path for the test code.
    - test_framework (str): The test framework to use, e.g., "pytest".
    - test_timeout (int): The timeout for each test in seconds.
    - test_index (int): The index from 0-(n-1) if n tests are generated for a single trace_id

    Returns
    -------
    - Dict[str, str] | None: The generated regression tests and instrumented tests, or None if an error occurred.

    """
    assert test_framework in ["pytest", "unittest"], (
        f"Invalid test framework, got {test_framework} but expected 'pytest' or 'unittest'"
    )
    payload = {
        "source_code_being_tested": source_code_being_tested,
        "function_to_optimize": function_to_optimize,
        "helper_function_names": helper_function_names,
        "module_path": module_path,
        "test_module_path": test_module_path,
        "test_framework": test_framework,
        "test_timeout": test_timeout,
        "trace_id": trace_id,
        "test_index": test_index,
        "python_version": platform.python_version(),
        "codeflash_version": codeflash_version,
        "is_async": function_to_optimize.is_async,
    }
    try:
        response = self.make_ai_service_request("/testgen", payload=payload, timeout=90)
    except requests.exceptions.RequestException as e:
        logger.exception(f"Error generating tests: {e}")
        ph("cli-testgen-error-caught", {"error": str(e)})
        return None

    # the timeout should be the same as the timeout for the AI service backend

    if response.status_code == 200:
        response_json = response.json()
        logger.debug(f"Generated tests for function {function_to_optimize.function_name}")
        return (
            response_json["generated_tests"],
            response_json["instrumented_behavior_tests"],
            response_json["instrumented_perf_tests"],
        )
    try:
        error = response.json()["error"]
        logger.error(f"Error generating tests: {response.status_code} - {error}")
        ph("cli-testgen-error-response", {"response_status_code": response.status_code, "error": error})
        return None  # noqa: TRY300
    except Exception:
        logger.error(f"Error generating tests: {response.status_code} - {response.text}")
        ph("cli-testgen-error-response", {"response_status_code": response.status_code, "error": response.text})
        return None

def get_optimization_review(
    self,
    original_code: dict[Path, str],
    new_code: dict[Path, str],
    explanation: Explanation,
    existing_tests_source: str,
    generated_original_test_source: str,
    function_trace_id: str,
    coverage_message: str,
    replay_tests: str,
    root_dir: Path,
    concolic_tests: str,  # noqa: ARG002
    calling_fn_details: str,
) -> str:
    """Compute the optimization review of current Pull Request.

    Args:
    original_code: dict -> data structure mapping file paths to function definition for original code
    new_code: dict -> data structure mapping file paths to function definition for optimized code
    explanation: Explanation -> data structure containing runtime information
    existing_tests_source: str -> existing tests table
    generated_original_test_source: str -> annotated generated tests
    function_trace_id: str -> traceid of function
    coverage_message: str -> coverage information
    replay_tests: str -> replay test table
    root_dir: Path -> path of git directory
    concolic_tests: str -> concolic_tests (not used)
    calling_fn_details: str -> filenames and definitions of functions which call the function_to_optimize

    Returns:
    -------
    - 'high', 'medium' or 'low' optimization review

    """
    diff_str = "\n".join(
        [
            unified_diff_strings(
                code1=original_code[p],
                code2=new_code[p],
                fromfile=Path(p).relative_to(root_dir).as_posix(),
                tofile=Path(p).relative_to(root_dir).as_posix(),
            )
            for p in original_code
            if not is_zero_diff(original_code[p], new_code[p])
        ]
    )
    code_diff = f"```diff\n{diff_str}\n```"
    logger.info("!lsp|Computing Optimization Review…")
    payload = {
        "code_diff": code_diff,
        "explanation": explanation.raw_explanation_message,
        "existing_tests": existing_tests_source,
        "generated_tests": generated_original_test_source,
        "trace_id": function_trace_id,
        "coverage_message": coverage_message,
        "replay_tests": replay_tests,
        "speedup": f"{(100 * float(explanation.speedup)):.2f}%",
        "loop_count": explanation.winning_benchmarking_test_results.number_of_loops(),
        "benchmark_details": explanation.benchmark_details if explanation.benchmark_details else None,
        "optimized_runtime": humanize_runtime(explanation.best_runtime_ns),
        "original_runtime": humanize_runtime(explanation.original_runtime_ns),
        "calling_fn_details": calling_fn_details,
    }
    console.rule()
    try:
        response = self.make_ai_service_request("/optimization_review", payload=payload, timeout=120)
    except requests.exceptions.RequestException as e:

@github-actions

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Preserve backward-compatible payload key

Renaming the payload key from optimizationImpact to optimizationReview may break
backend/JS consumers expecting the legacy key. To avoid runtime failures, send both
keys for backward compatibility until downstream is updated.

codeflash/api/cfapi.py [161]

-"optimizationReview": optimization_review,  # impact keyword left for legacy reasons, touches js/ts code
+"optimizationImpact": optimization_review,  # kept for backward compatibility with js/ts clients
+"optimizationReview": optimization_review,
Suggestion importance[1-10]: 7


Why: Renaming optimizationImpact to optimizationReview may break existing JS/TS consumers; sending both keys is a practical compatibility safeguard. The improved code correctly reflects the change and is directly applicable to the modified payload line.

Medium
Send both keys across endpoints

This key rename is repeated in multiple endpoints and can break different consumers.
Provide both optimizationImpact and optimizationReview in create_pr and
create_staging as well to ensure consistent compatibility.

codeflash/api/cfapi.py [203]

-"optimizationReview": optimization_review,  # Impact keyword left for legacy reasons, it touches js/ts codebase
+"optimizationImpact": optimization_review,  # kept for backward compatibility with js/ts clients
+"optimizationReview": optimization_review,
Suggestion importance[1-10]: 7


Why: The same rename occurs in create_pr and create_staging; providing both keys ensures consistent backward compatibility. The proposed code aligns with the updated lines and avoids breaking downstream consumers.

Medium
General
Normalize review threshold comparison

Comparing opt_review_response to the string "low" can misroute logic if the service
returns non-string, structured, or differently cased values. Normalize and guard the
comparison to avoid skipping PR creation unexpectedly.

codeflash/optimization/function_optimizer.py [1478-1481]

-if raise_pr and not staging_review and opt_review_response != "low":
+normalized_review = str(opt_review_response).strip().lower()
+if raise_pr and not staging_review and normalized_review != "low":
     data["git_remote"] = self.args.git_remote
     check_create_pr(**data)
Suggestion importance[1-10]: 6


Why: Normalizing opt_review_response before comparison prevents logic errors if the service returns varied types or casing. It's a reasonable robustness improvement, though not critical and should be verified against expected response shapes.

Low

@misrasaurabh1 misrasaurabh1 merged commit 78d707f into main Oct 22, 2025
22 of 23 checks passed