
Conversation


@codeflash-ai codeflash-ai bot commented Sep 26, 2025

⚡️ This pull request contains optimizations for PR #678

If you approve this dependent PR, these changes will be merged into the original PR branch standalone-fto-async.

This PR will be automatically closed if the original PR is merged.


📄 11% (0.11x) speedup for CommentMapper.visit_FunctionDef in codeflash/code_utils/edit_generated_tests.py

⏱️ Runtime : 2.62 milliseconds → 2.36 milliseconds (best of 295 runs)

📝 Explanation and details

The optimized code achieves an 11% speedup through several targeted micro-optimizations that reduce attribute lookups and function call overhead in the hot path:

Key Optimizations (illustrated by the sketch after this list):

  1. Local Variable Caching: Frequently accessed attributes (self.context_stack, self.original_runtimes, etc.) are stored in local variables at function start, eliminating repeated self. lookups in tight loops.

  2. Method Reference Caching: self.get_comment is cached as get_comment, and context_stack.append/pop are stored locally, reducing method lookup overhead in the main processing loop.

  3. Type Tuple Pre-computation: The commonly used type tuples for isinstance checks are stored in local variables (_stmt_types, _node_stmt_assign), avoiding tuple creation on every iteration.

  4. Optimized Node Collection: The inefficient pattern of creating a list then extending it (nodes_to_check = [...]; nodes_to_check.extend(...)) is replaced with conditional unpacking ([compound_line_node, *body_attr] or [compound_line_node]), reducing list operations.

  5. f-string Usage: String concatenations for inv_id and match_key are converted to f-strings, which are faster than concatenation operations.
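
Below is a minimal, hypothetical sketch of how these five patterns combine in a visit_FunctionDef-style method. The attribute and method names (context_stack, get_comment, results) mirror the prose above, but the real implementation in edit_generated_tests.py may differ in detail.

import ast

class CommentMapperSketch(ast.NodeVisitor):
    def __init__(self, original_runtimes, optimized_runtimes):
        self.original_runtimes = original_runtimes
        self.optimized_runtimes = optimized_runtimes
        self.results = {}
        self.context_stack = []

    def get_comment(self, orig_ns, opt_ns):
        # Placeholder for the real comment formatting.
        return f"# {orig_ns}ns -> {opt_ns}ns"

    def visit_FunctionDef(self, node):
        # 1. Local variable caching: hoist repeated self. lookups out of the loop.
        original_runtimes = self.original_runtimes
        optimized_runtimes = self.optimized_runtimes
        results = self.results
        context_stack = self.context_stack
        # 2. Method reference caching: bind bound methods once.
        get_comment = self.get_comment
        push, pop = context_stack.append, context_stack.pop
        # 3. Pre-computed type tuples for isinstance checks.
        _stmt_types = (ast.For, ast.While, ast.If, ast.With)
        _node_stmt_assign = (ast.Assign, ast.AugAssign, ast.AnnAssign)

        push(node.name)
        for i, stmt in enumerate(node.body):
            if isinstance(stmt, _node_stmt_assign):
                # 5. f-strings instead of `+` concatenation for key building.
                inv_id = f"{i}"
                match_key = f"{'#'.join(context_stack)}#{inv_id}"
                if match_key in original_runtimes and match_key in optimized_runtimes:
                    results[stmt.lineno] = get_comment(
                        original_runtimes[match_key], optimized_runtimes[match_key]
                    )
            elif isinstance(stmt, _stmt_types):
                # 4. Conditional unpacking instead of list-then-extend.
                orelse = getattr(stmt, "orelse", None)
                nodes_to_check = [stmt, *orelse] if orelse else [stmt]
                # ...recurse into nodes_to_check (omitted for brevity)...
        pop()
        return node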

Performance Characteristics:

  • Best gains on large-scale test cases (24.9-32.5% faster) with many nested blocks or statements, where the micro-optimizations compound
  • Minimal overhead on simple cases (0.6-7.9% variance), showing the optimizations don't hurt baseline performance
  • Most effective when processing complex ASTs with deep nesting, as seen in the test_large_many_nested_blocks (24.9% faster) and test_large_sparse_runtime_keys (32.5% faster) cases

The optimizations target the innermost loops where attribute lookups and object creation happen most frequently, making them particularly effective for batch AST processing workflows.

Correctness verification report:

| Test                          | Status        |
|-------------------------------|---------------|
| ⚙️ Existing Unit Tests        | 🔘 None Found |
| 🌀 Generated Regression Tests | 60 Passed     |
| ⏪ Replay Tests               | 🔘 None Found |
| 🔎 Concolic Coverage Tests    | 🔘 None Found |
| 📊 Tests Coverage             | 100.0%        |
🌀 Generated Regression Tests and Runtime
from __future__ import annotations

import ast

# imports
from codeflash.code_utils.edit_generated_tests import CommentMapper


# Helper classes and functions for testing
class DummyPath:
    """A dummy Path-like object for testing .with_suffix()."""
    def __init__(self, val):
        self.val = val
    def with_suffix(self, suffix=""):
        # Always return self for simplicity
        return self
    def __str__(self):
        return self.val

class DummyTest:
    """A dummy GeneratedTests-like object for testing."""
    def __init__(self, path):
        self.behavior_file_path = DummyPath(path)

# Helper to parse code and get FunctionDef node
def get_funcdef_node(code: str) -> ast.FunctionDef:
    tree = ast.parse(code)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            return node
    raise ValueError("No FunctionDef found")

# Helper to simulate runtime dicts
def make_runtime_dict(key_base, inv_ids, orig, opt):
    """Create runtime dicts with keys for inv_ids and values for orig/opt."""
    orig_dict = {}
    opt_dict = {}
    for inv_id, o, p in zip(inv_ids, orig, opt):
        k = f"{key_base}#{inv_id}"
        orig_dict[k] = o
        opt_dict[k] = p
    return orig_dict, opt_dict

# --- Basic Test Cases ---

def test_basic_single_statement():
    """Test a simple function with one statement."""
    code = "def foo():\n    x = 1"
    func_node = get_funcdef_node(code)
    test = DummyTest("file.py")
    key_base = "foo#file.py"
    orig_dict, opt_dict = make_runtime_dict(key_base, ["0"], [100], [50])
    mapper = CommentMapper(test, orig_dict, opt_dict)
    mapper.visit_FunctionDef(func_node) # 7.92μs -> 8.32μs (4.82% slower)
    comment = mapper.results[2]

def test_basic_multiple_statements():
    """Test a function with multiple statements."""
    code = "def bar():\n    a = 1\n    b = 2\n    c = 3"
    func_node = get_funcdef_node(code)
    test = DummyTest("file.py")
    key_base = "bar#file.py"
    orig_dict, opt_dict = make_runtime_dict(key_base, ["2", "1", "0"], [10, 20, 30], [8, 18, 25])
    mapper = CommentMapper(test, orig_dict, opt_dict)
    mapper.visit_FunctionDef(func_node) # 12.0μs -> 11.9μs (0.763% faster)
    # Should have results for lines 2, 3, 4
    for lineno in [2, 3, 4]:
        comment = mapper.results[lineno]

def test_basic_with_control_flow():
    """Test a function with an if statement and assignments inside."""
    code = (
        "def baz():\n"
        "    if True:\n"
        "        x = 1\n"
        "        y = 2\n"
        "    z = 3"
    )
    func_node = get_funcdef_node(code)
    test = DummyTest("file.py")
    key_base = "baz#file.py"
    # inv_ids: "0_1" for y=2, "0_0" for x=1, "1" for z=3
    orig_dict, opt_dict = make_runtime_dict(key_base, ["0_1", "0_0", "1"], [100, 200, 300], [90, 180, 250])
    mapper = CommentMapper(test, orig_dict, opt_dict)
    mapper.visit_FunctionDef(func_node) # 13.6μs -> 13.0μs (4.86% faster)
    # Should have results for lines 3, 4, 5
    for lineno in [3, 4, 5]:
        comment = mapper.results[lineno]

# --- Edge Test Cases ---

def test_edge_no_body():
    """Test a function with no body."""
    code = "def empty():\n    pass"
    func_node = get_funcdef_node(code)
    test = DummyTest("file.py")
    key_base = "empty#file.py"
    # No runtime keys
    orig_dict, opt_dict = {}, {}
    mapper = CommentMapper(test, orig_dict, opt_dict)
    mapper.visit_FunctionDef(func_node) # 2.89μs -> 3.14μs (7.97% slower)

def test_edge_missing_runtime_keys():
    """Test when runtime keys are missing for some statements."""
    code = "def partial():\n    a = 1\n    b = 2"
    func_node = get_funcdef_node(code)
    test = DummyTest("file.py")
    key_base = "partial#file.py"
    # Only provide runtime for the first statement
    orig_dict, opt_dict = make_runtime_dict(key_base, ["1"], [10], [5])
    mapper = CommentMapper(test, orig_dict, opt_dict)
    mapper.visit_FunctionDef(func_node) # 7.80μs -> 7.85μs (0.624% slower)

def test_edge_nested_control_flow():
    """Test a function with nested control flow (if inside for)."""
    code = (
        "def nested():\n"
        "    for i in range(2):\n"
        "        if i:\n"
        "            x = i\n"
        "        y = i\n"
        "    z = 0"
    )
    func_node = get_funcdef_node(code)
    test = DummyTest("file.py")
    key_base = "nested#file.py"
    # inv_ids: "0_1_0" for x=i, "0_1_1" for y=i, "1" for z=0
    orig_dict = {
        f"{key_base}#0_1_0": 50,
        f"{key_base}#0_1_1": 60,
        f"{key_base}#1": 70,
    }
    opt_dict = {
        f"{key_base}#0_1_0": 45,
        f"{key_base}#0_1_1": 55,
        f"{key_base}#1": 65,
    }
    mapper = CommentMapper(test, orig_dict, opt_dict)
    mapper.visit_FunctionDef(func_node) # 10.8μs -> 10.4μs (3.95% faster)
    # Should have results for lines 4, 5, 7
    for lineno in [4, 5, 7]:
        assert lineno in mapper.results

def test_edge_async_function():
    """Test that visit_FunctionDef ignores async functions."""
    code = "async def afunc():\n    a = 1"
    tree = ast.parse(code)
    node = tree.body[0]
    test = DummyTest("file.py")
    key_base = "afunc#file.py"
    orig_dict, opt_dict = make_runtime_dict(key_base, ["0"], [10], [5])
    mapper = CommentMapper(test, orig_dict, opt_dict)
    # Should not process AsyncFunctionDef with visit_FunctionDef
    # So results should be empty
    mapper.visit_FunctionDef(node) # 6.76μs -> 7.14μs (5.33% slower)

def test_edge_non_assign_statement():
    """Test that non-assign statements are skipped."""
    code = "def skip():\n    print('hi')\n    x = 2"
    func_node = get_funcdef_node(code)
    test = DummyTest("file.py")
    key_base = "skip#file.py"
    orig_dict, opt_dict = make_runtime_dict(key_base, ["1"], [10], [5])
    mapper = CommentMapper(test, orig_dict, opt_dict)
    mapper.visit_FunctionDef(func_node) # 7.55μs -> 7.69μs (1.82% slower)

def test_edge_slower_runtime():
    """Test that 'slower' status is shown when optimized_time > original_time."""
    code = "def slow():\n    x = 1"
    func_node = get_funcdef_node(code)
    test = DummyTest("file.py")
    key_base = "slow#file.py"
    orig_dict, opt_dict = make_runtime_dict(key_base, ["0"], [10], [15])
    mapper = CommentMapper(test, orig_dict, opt_dict)
    mapper.visit_FunctionDef(func_node) # 6.67μs -> 6.96μs (4.15% slower)

# --- Large Scale Test Cases ---

def test_large_many_statements():
    """Test performance and correctness with a function with many statements."""
    N = 500  # Not exceeding 1000
    code_lines = ["def big():"] + [f"    x{i} = {i}" for i in range(N)]
    code = "\n".join(code_lines)
    func_node = get_funcdef_node(code)
    test = DummyTest("file.py")
    key_base = "big#file.py"
    inv_ids = [str(i) for i in reversed(range(N))]
    orig = [i*10 for i in range(N)]
    opt = [i*8 for i in range(N)]
    orig_dict, opt_dict = make_runtime_dict(key_base, inv_ids, orig, opt)
    mapper = CommentMapper(test, orig_dict, opt_dict)
    mapper.visit_FunctionDef(func_node) # 937μs -> 857μs (9.23% faster)
    # Should have N results, all lines from 2 to N+1
    for lineno in range(2, N+2):
        assert lineno in mapper.results

def test_large_many_nested_blocks():
    """Test performance with many nested control blocks."""
    N = 50  # Not exceeding 1000 total statements
    code_lines = ["def nest():", "    for i in range(1):"]
    for j in range(N):
        code_lines.append(f"        if i == {j}:")
        code_lines.append(f"            x{j} = {j}")
    code = "\n".join(code_lines)
    func_node = get_funcdef_node(code)
    test = DummyTest("file.py")
    key_base = "nest#file.py"
    inv_ids = [f"0_{j}_0" for j in range(N)]
    orig = [j*100 for j in range(N)]
    opt = [j*80 for j in range(N)]
    orig_dict, opt_dict = make_runtime_dict(key_base, inv_ids, orig, opt)
    mapper = CommentMapper(test, orig_dict, opt_dict)
    mapper.visit_FunctionDef(func_node) # 49.1μs -> 39.3μs (24.9% faster)
    # Should have N results for lines 4,6,...,2*N+2
    for lineno in range(4, 2*N+4, 2):
        assert lineno in mapper.results

def test_large_sparse_runtime_keys():
    """Test with many statements but only a few runtime keys present."""
    N = 200
    code_lines = ["def sparse():"] + [f"    x{i} = {i}" for i in range(N)]
    code = "\n".join(code_lines)
    func_node = get_funcdef_node(code)
    test = DummyTest("file.py")
    key_base = "sparse#file.py"
    # Only every 20th statement has runtime info
    inv_ids = [str(i) for i in reversed(range(N)) if i % 20 == 0]
    orig = [i*100 for i in range(N) if i % 20 == 0]
    opt = [i*90 for i in range(N) if i % 20 == 0]
    orig_dict, opt_dict = make_runtime_dict(key_base, inv_ids, orig, opt)
    mapper = CommentMapper(test, orig_dict, opt_dict)
    mapper.visit_FunctionDef(func_node) # 99.5μs -> 75.1μs (32.5% faster)
    # Only lines 2,22,42,... should have results
    for i in range(0, N, 20):
        assert i + 2 in mapper.results
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import ast
from pathlib import Path

# imports
from codeflash.code_utils.edit_generated_tests import CommentMapper


class GeneratedTests:
    def __init__(self, behavior_file_path: Path):
        self.behavior_file_path = behavior_file_path

# The function under test is visit_FunctionDef, a method of the imported CommentMapper

# Helper to build a CommentMapper with test data
def make_mapper(
    func_name: str,
    original_runtimes: dict[str, int],
    optimized_runtimes: dict[str, int],
    path: str = "myfile.py",
) -> CommentMapper:
    test = GeneratedTests(behavior_file_path=Path(path))
    return CommentMapper(test, original_runtimes, optimized_runtimes)

# Helper to parse code and get the FunctionDef node
def get_funcdef_node(src: str) -> ast.FunctionDef:
    mod = ast.parse(src)
    for node in mod.body:
        if isinstance(node, ast.FunctionDef):
            return node
    raise ValueError("No FunctionDef found")

# -------------------- UNIT TESTS --------------------

# 1. Basic Test Cases

def test_single_statement_function():
    # Function with one statement, runtime data present
    src = "def foo():\n    x = 1"
    node = get_funcdef_node(src)
    # Key: foo#myfile#0
    orig = {"foo#myfile#0": 1000}
    opt = {"foo#myfile#0": 500}
    mapper = make_mapper("foo", orig, opt)
    mapper.visit_FunctionDef(node) # 10.8μs -> 11.2μs (3.50% slower)
    comment = mapper.results[2]

def test_multiple_statements_function():
    # Function with two statements, both present in runtime data
    src = "def bar():\n    x = 1\n    y = 2"
    node = get_funcdef_node(src)
    orig = {"bar#myfile#0": 2000, "bar#myfile#1": 1000}
    opt = {"bar#myfile#0": 1000, "bar#myfile#1": 500}
    mapper = make_mapper("bar", orig, opt)
    mapper.visit_FunctionDef(node) # 13.4μs -> 13.4μs (0.374% faster)

def test_no_runtime_data():
    # Function with statements, but no runtime data
    src = "def baz():\n    x = 1\n    y = 2"
    node = get_funcdef_node(src)
    orig = {}
    opt = {}
    mapper = make_mapper("baz", orig, opt)
    mapper.visit_FunctionDef(node) # 5.93μs -> 5.78μs (2.59% faster)

def test_with_block():
    # Function with a with block, runtime data for inner statement
    src = "def qux():\n    with open('f') as f:\n        x = f.read()"
    node = get_funcdef_node(src)
    # Key: qux#myfile#0_0
    orig = {"qux#myfile#0_0": 4000}
    opt = {"qux#myfile#0_0": 2000}
    mapper = make_mapper("qux", orig, opt)
    mapper.visit_FunctionDef(node) # 11.7μs -> 11.4μs (2.27% faster)

def test_for_loop_block():
    # Function with a for loop, runtime data for inner statement
    src = "def loop():\n    for i in range(2):\n        x = i"
    node = get_funcdef_node(src)
    orig = {"loop#myfile#0_0": 3000}
    opt = {"loop#myfile#0_0": 1000}
    mapper = make_mapper("loop", orig, opt)
    mapper.visit_FunctionDef(node) # 11.5μs -> 11.0μs (4.73% faster)

# 2. Edge Test Cases

def test_empty_function():
    # Function with no body
    src = "def empty():\n    pass"
    node = get_funcdef_node(src)
    orig = {}
    opt = {}
    mapper = make_mapper("empty", orig, opt)
    mapper.visit_FunctionDef(node) # 5.17μs -> 5.14μs (0.584% faster)

def test_missing_optimized_runtime():
    # Function with original runtime but missing optimized
    src = "def foo():\n    x = 1"
    node = get_funcdef_node(src)
    orig = {"foo#myfile#0": 1000}
    opt = {}
    mapper = make_mapper("foo", orig, opt)
    mapper.visit_FunctionDef(node) # 5.16μs -> 5.35μs (3.53% slower)

def test_missing_original_runtime():
    # Function with optimized runtime but missing original
    src = "def foo():\n    x = 1"
    node = get_funcdef_node(src)
    orig = {}
    opt = {"foo#myfile#0": 500}
    mapper = make_mapper("foo", orig, opt)
    mapper.visit_FunctionDef(node) # 5.07μs -> 5.24μs (3.23% slower)

def test_nested_blocks():
    # Function with nested blocks (if inside for)
    src = (
        "def nest():\n"
        "    for i in range(2):\n"
        "        if i:\n"
        "            x = i\n"
    )
    node = get_funcdef_node(src)
    # Key: nest#myfile#0_0_0
    orig = {"nest#myfile#0_0_0": 5000}
    opt = {"nest#myfile#0_0_0": 2500}
    mapper = make_mapper("nest", orig, opt)
    mapper.visit_FunctionDef(node) # 7.16μs -> 6.92μs (3.47% faster)

def test_slower_performance():
    # Function where optimized runtime is slower
    src = "def slow():\n    x = 1"
    node = get_funcdef_node(src)
    orig = {"slow#myfile#0": 1000}
    opt = {"slow#myfile#0": 2000}
    mapper = make_mapper("slow", orig, opt)
    mapper.visit_FunctionDef(node) # 10.7μs -> 10.8μs (0.380% slower)

def test_zero_original_runtime():
    # Original runtime is zero, should not crash
    src = "def zero():\n    x = 1"
    node = get_funcdef_node(src)
    orig = {"zero#myfile#0": 0}
    opt = {"zero#myfile#0": 0}
    mapper = make_mapper("zero", orig, opt)
    mapper.visit_FunctionDef(node) # 8.87μs -> 9.11μs (2.64% slower)

def test_non_assign_stmt_in_block():
    # Compound block with a non-assign statement (e.g., Expr)
    src = "def expr():\n    for i in range(2):\n        print(i)"
    node = get_funcdef_node(src)
    orig = {"expr#myfile#0_0": 6000}
    opt = {"expr#myfile#0_0": 3000}
    mapper = make_mapper("expr", orig, opt)
    mapper.visit_FunctionDef(node) # 11.9μs -> 11.3μs (5.43% faster)

def test_multiple_blocks():
    # Function with multiple compound blocks
    src = (
        "def multi():\n"
        "    for i in range(2):\n"
        "        x = i\n"
        "    while True:\n"
        "        y = 1\n"
    )
    node = get_funcdef_node(src)
    orig = {
        "multi#myfile#0_0": 1000,
        "multi#myfile#1_0": 2000,
    }
    opt = {
        "multi#myfile#0_0": 500,
        "multi#myfile#1_0": 1000,
    }
    mapper = make_mapper("multi", orig, opt)
    mapper.visit_FunctionDef(node) # 15.7μs -> 15.1μs (3.57% faster)

def test_body_with_else():
    # If block with else, both branches
    src = (
        "def branch():\n"
        "    if True:\n"
        "        x = 1\n"
        "    else:\n"
        "        y = 2\n"
    )
    node = get_funcdef_node(src)
    orig = {
        "branch#myfile#0_0": 1000,
        "branch#myfile#0_1": 2000,
    }
    opt = {
        "branch#myfile#0_0": 500,
        "branch#myfile#0_1": 1000,
    }
    mapper = make_mapper("branch", orig, opt)
    mapper.visit_FunctionDef(node) # 11.2μs -> 11.0μs (1.91% faster)

def test_body_with_if_and_assign():
    # If block with assign in body and else
    src = (
        "def if_assign():\n"
        "    if True:\n"
        "        x = 1\n"
        "    else:\n"
        "        y = 2\n"
        "    z = 3\n"
    )
    node = get_funcdef_node(src)
    orig = {
        "if_assign#myfile#0_0": 1000,
        "if_assign#myfile#0_1": 2000,
        "if_assign#myfile#1": 3000,
    }
    opt = {
        "if_assign#myfile#0_0": 500,
        "if_assign#myfile#0_1": 1000,
        "if_assign#myfile#1": 1500,
    }
    mapper = make_mapper("if_assign", orig, opt)
    mapper.visit_FunctionDef(node) # 14.6μs -> 14.3μs (2.60% faster)

# 3. Large Scale Test Cases

def test_large_function_body():
    # Function with 500 statements
    src_lines = ["def bigfunc():"]
    for i in range(500):
        src_lines.append(f"    x{i} = {i}")
    src = "\n".join(src_lines)
    node = get_funcdef_node(src)
    orig = {f"bigfunc#myfile#{i}": 1000 + i for i in range(500)}
    opt = {f"bigfunc#myfile#{i}": 500 + i for i in range(500)}
    mapper = make_mapper("bigfunc", orig, opt)
    mapper.visit_FunctionDef(node) # 937μs -> 843μs (11.1% faster)
    # All lines should have comments
    for lineno in range(2, 502):
        assert lineno in mapper.results

def test_large_nested_blocks():
    # Function with 100 for loops, each with one assign
    src_lines = ["def manyloops():"]
    for i in range(100):
        src_lines.append(f"    for j{i} in range(2):")
        src_lines.append(f"        x{i} = j{i}")
    src = "\n".join(src_lines)
    node = get_funcdef_node(src)
    orig = {f"manyloops#myfile#{i}_0": 2000 + i for i in range(100)}
    opt = {f"manyloops#myfile#{i}_0": 1000 + i for i in range(100)}
    mapper = make_mapper("manyloops", orig, opt)
    mapper.visit_FunctionDef(node) # 264μs -> 238μs (11.0% faster)
    # Each loop's inner assign should have a comment (assigns sit at lines 3, 5, ...)
    for i in range(100):
        assert 3 + i * 2 in mapper.results

def test_large_mixed_blocks():
    # Function with alternating for/while/if blocks, up to 50 blocks
    src_lines = ["def mixed():"]
    for i in range(50):
        src_lines.append(f"    for k{i} in range(2):")
        src_lines.append(f"        if k{i} % 2 == 0:")
        src_lines.append(f"            x{i} = k{i}")
        src_lines.append("        else:")
        src_lines.append(f"            y{i} = k{i}")
        src_lines.append(f"    while True:")
        src_lines.append(f"        z{i} = k{i}")
    src = "\n".join(src_lines)
    node = get_funcdef_node(src)
    orig = {}
    opt = {}
    for i in range(50):
        orig[f"mixed#myfile#{i}_0_0"] = 1000 + i
        opt[f"mixed#myfile#{i}_0_0"] = 500 + i
        orig[f"mixed#myfile#{i}_0_1"] = 2000 + i
        opt[f"mixed#myfile#{i}_0_1"] = 1000 + i
        orig[f"mixed#myfile#{i}_1_0"] = 3000 + i
        opt[f"mixed#myfile#{i}_1_0"] = 1500 + i
    mapper = make_mapper("mixed", orig, opt)
    mapper.visit_FunctionDef(node) # 110μs -> 83.5μs (32.3% faster)
    # Each block's inner assigns (x, y, z) should have comments
    for i in range(50):
        base = 2 + 7 * i  # line of this block's `for` statement
        for offset in (2, 4, 6):  # x, y, z assignment lines within the block
            assert base + offset in mapper.results
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-pr678-2025-09-26T19.48.07` and push.

Codeflash

mohammedahmed18 and others added 30 commits August 22, 2025 05:58
[LSP] Ensure optimizer cleanup on server shutdown or when the client suddenly disconnects
…licate-global-assignments-when-reverting-helpers
…/duplicate-global-assignments-when-reverting-helpers`)

The optimized code achieves a **17% speedup** by eliminating redundant CST parsing operations, which are the most expensive parts of the function according to the line profiler.

**Key optimizations:**

1. **Eliminate duplicate parsing**: The original code parsed `src_module_code` and `dst_module_code` multiple times. The optimized version introduces `_extract_global_statements_once()` that parses each module only once and reuses the parsed CST objects throughout the function.

2. **Reuse parsed modules**: Instead of re-parsing `dst_module_code` after modifications, the optimized version conditionally reuses the already-parsed `dst_module` when no global statements need insertion, avoiding unnecessary `cst.parse_module()` calls.

3. **Early termination**: Added an early return when `new_collector.assignments` is empty, avoiding the expensive `GlobalAssignmentTransformer` creation and visitation when there's nothing to transform.

4. **Minor optimization in uniqueness check**: Added a fast-path identity check (`stmt is existing_stmt`) before the expensive `deep_equals()` comparison, though this has minimal impact.

**Performance impact by test case type:**
- **Empty/minimal cases**: Show the highest gains (59-88% faster) due to early termination optimizations
- **Standard cases**: Achieve consistent 20-30% improvements from reduced parsing
- **Large-scale tests**: Benefit significantly (18-23% faster) as parsing overhead scales with code size

The optimization is most effective for workloads with moderate to large code files where CST parsing dominates the runtime, as evidenced by the original profiler showing 70%+ of time spent in `cst.parse_module()` and `module.visit()` operations.
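
A sketch of the resulting structure, assuming libcst, is below; GlobalAssignmentCollector and GlobalAssignmentTransformer are simplified illustrative stand-ins (no scope filtering), not the real codeflash classes.

import libcst as cst

class GlobalAssignmentCollector(cst.CSTVisitor):
    """Collects assignment statement lines (simplified stand-in)."""
    def __init__(self):
        super().__init__()
        self.assignments = []

    def visit_SimpleStatementLine(self, node):
        if any(isinstance(s, cst.Assign) for s in node.body):
            self.assignments.append(node)

class GlobalAssignmentTransformer(cst.CSTTransformer):
    """Appends the collected assignments to the destination module."""
    def __init__(self, assignments):
        super().__init__()
        self.assignments = assignments

    def leave_Module(self, original_node, updated_node):
        return updated_node.with_changes(body=[*updated_node.body, *self.assignments])

def add_global_assignments_sketch(src_module_code: str, dst_module_code: str) -> str:
    # Parse each module exactly once and reuse the CST objects throughout.
    src_module = cst.parse_module(src_module_code)
    dst_module = cst.parse_module(dst_module_code)

    collector = GlobalAssignmentCollector()
    src_module.visit(collector)

    # Early termination: nothing to insert, so skip the transformer entirely.
    if not collector.assignments:
        return dst_module_code

    return dst_module.visit(GlobalAssignmentTransformer(collector.assignments)).code
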
Signed-off-by: Saurabh Misra <[email protected]>
…25-08-25T18.50.33

⚡️ Speed up function `add_global_assignments` by 18% in PR #683 (`fix/duplicate-global-assignments-when-reverting-helpers`)
…cs-in-diff

[Lsp] return diff functions grouped by file
* lsp: get new/modified functions inside a git commit

* better name

* refactor

* revert
* save optimization patches metadata

* typo

* lsp: get previous optimizations

* fix patch name in non-lsp mode

* ⚡️ Speed up function `get_patches_metadata` by 45% in PR #690 (`worktree/persist-optimization-patches`)

The optimized code achieves a **44% speedup** through two key optimizations:

**1. Added `@lru_cache(maxsize=1)` to `get_patches_dir_for_project()`**
- This caches the Path object construction, avoiding repeated calls to `get_git_project_id()` and `Path()` creation
- The line profiler shows this function's total time dropped from 5.32ms to being completely eliminated from the hot path in `get_patches_metadata()`
- Since `get_git_project_id()` was already cached but still being called repeatedly, this second-level caching eliminates that redundancy

**2. Replaced `read_text()` + `json.loads()` with `open()` + `json.load()`**
- Using `json.load()` with a file handle is more efficient than reading the entire file into memory first with `read_text()` then parsing it
- This avoids the intermediate string creation and is particularly beneficial for larger JSON files
- Added explicit UTF-8 encoding for consistency

**Performance Impact by Test Type:**
- **Basic cases** (small/missing files): 45-65% faster - benefits primarily from the caching optimization
- **Edge cases** (malformed JSON): 38-47% faster - still benefits from both optimizations  
- **Large scale cases** (1000+ patches, large files): 39-52% faster - the file I/O optimization becomes more significant with larger JSON files

The caching optimization provides the most consistent gains across all scenarios since it eliminates repeated expensive operations, while the file I/O optimization scales with file size.
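
A sketch of both changes follows; the directory layout, metadata filename, and the get_git_project_id stub are assumptions for illustration, not the actual codeflash API.

import json
from functools import lru_cache
from pathlib import Path

def get_git_project_id() -> str:
    return "example-project-id"  # stand-in for the real (cached) lookup

@lru_cache(maxsize=1)
def get_patches_dir_for_project() -> Path:
    # Cached so the project-id lookup and Path construction happen only once.
    return Path.home() / ".codeflash" / "patches" / get_git_project_id()

def get_patches_metadata() -> dict:
    meta_file = get_patches_dir_for_project() / "metadata.json"  # assumed filename
    if not meta_file.exists():
        return {}
    # json.load on an open file handle avoids the intermediate string that
    # read_text() + json.loads() would allocate.
    with open(meta_file, encoding="utf-8") as f:
        return json.load(f)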

* fix: patch path

* codeflash suggestions

* split the worktree utils in a separate file

---------

Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
Saga4 and others added 24 commits September 22, 2025 15:55
* LSP reduce no of candidates

* config revert

* pass reference values to aiservices

* line profiling loading msg

---------

Co-authored-by: saga4 <[email protected]>
Co-authored-by: ali <[email protected]>
* LSP reduce no of candidates

* config revert

* pass reference values to aiservices

* fix inline condition

---------

Co-authored-by: saga4 <[email protected]>
Signed-off-by: Saurabh Misra <[email protected]>
apscheduler tries to schedule jobs while the interpreter is shutting down, which can cause it to crash and leave us in a bad state
The optimized version eliminates recursive function calls by replacing the recursive `_find` helper with an iterative approach. This provides significant performance benefits:

**Key Optimizations:**

1. **Removed Recursion Overhead**: The original code used a recursive helper function `_find` that created new stack frames for each parent traversal. The optimized version uses a simple iterative loop that traverses parents sequentially without function call overhead.

2. **Eliminated Function Creation**: The original code defined the `_find` function on every call to `find_target_node`. The optimized version removes this repeated function definition entirely.

3. **Early Exit with for-else**: The optimized code uses Python's `for-else` construct to immediately return `None` when a parent class isn't found, avoiding unnecessary continued searching.

4. **Reduced Attribute Access**: By caching `function_to_optimize.function_name` in a local variable `target_name` and reusing `body` variables, the code reduces repeated attribute lookups.

**Performance Impact by Test Case:**
- **Simple cases** (top-level functions, basic class methods): 23-62% faster due to eliminated recursion overhead
- **Nested class scenarios**: 45-84% faster, with deeper nesting showing greater improvements as recursion elimination has more impact
- **Large-scale tests**: 12-22% faster, showing consistent benefits even with many nodes to traverse
- **Edge cases** (empty modules, non-existent classes): 52-76% faster due to more efficient early termination

The optimization is particularly effective for deeply nested class hierarchies where the original recursive approach created multiple stack frames, while the iterative version maintains constant memory usage regardless of nesting depth.
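
The shape of the iterative version, with an assumed FunctionToOptimize structure based on the prose, looks roughly like this:

import ast
from dataclasses import dataclass, field

@dataclass
class FunctionToOptimize:  # assumed shape, for illustration only
    function_name: str
    parents: list = field(default_factory=list)  # enclosing class names, outermost first

def find_target_node_sketch(module: ast.Module, function_to_optimize: FunctionToOptimize):
    target_name = function_to_optimize.function_name  # cache the attribute lookup
    body = module.body
    # Walk down enclosing classes iteratively; constant stack usage at any depth.
    for parent_name in function_to_optimize.parents:
        for node in body:
            if isinstance(node, ast.ClassDef) and node.name == parent_name:
                body = node.body
                break
        else:
            # for-else: parent class not found, stop searching immediately.
            return None
    for node in body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == target_name:
            return node
    return None
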
…25-09-25T14.28.58

⚡️ Speed up function `find_target_node` by 18% in PR #763 (`fix/correctly-find-funtion-node-when-reverting-helpers`)
…node-when-reverting-helpers

[FIX] Respect parent classes in revert helpers
…d move other merged test below; finish resolving aiservice/config/explanation/function_optimizer; regenerate uv.lock
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Sep 26, 2025
@KRRT7 KRRT7 force-pushed the standalone-fto-async branch from 40c4108 to 7bbb1e7 on September 26, 2025 20:26
@codeflash-ai codeflash-ai bot closed this Sep 27, 2025
@codeflash-ai codeflash-ai bot commented Sep 27, 2025

This PR has been automatically closed because the original PR #678 by KRRT7 was closed.

@codeflash-ai codeflash-ai bot deleted the codeflash/optimize-pr678-2025-09-26T19.48.07 branch September 27, 2025 00:16