⚡️ Speed up method CommentMapper.visit_FunctionDef by 11% in PR #678 (standalone-fto-async)
#766
Closed · codeflash-ai wants to merge 100 commits into standalone-fto-async from codeflash/optimize-pr678-2025-09-26T19.48.07
Conversation
[LSP] Ensure optimizer cleanup on server shutdown or when the client suddenly disconnects
…licate-global-assignments-when-reverting-helpers
…/duplicate-global-assignments-when-reverting-helpers`)

The optimized code achieves a **17% speedup** by eliminating redundant CST parsing operations, which are the most expensive parts of the function according to the line profiler.

**Key optimizations:**

1. **Eliminate duplicate parsing**: The original code parsed `src_module_code` and `dst_module_code` multiple times. The optimized version introduces `_extract_global_statements_once()`, which parses each module only once and reuses the parsed CST objects throughout the function.
2. **Reuse parsed modules**: Instead of re-parsing `dst_module_code` after modifications, the optimized version conditionally reuses the already-parsed `dst_module` when no global statements need insertion, avoiding unnecessary `cst.parse_module()` calls.
3. **Early termination**: Added an early return when `new_collector.assignments` is empty, avoiding the expensive `GlobalAssignmentTransformer` creation and visitation when there's nothing to transform.
4. **Minor optimization in uniqueness check**: Added a fast-path identity check (`stmt is existing_stmt`) before the expensive `deep_equals()` comparison, though this has minimal impact.

**Performance impact by test case type:**

- **Empty/minimal cases**: Show the highest gains (59-88% faster) due to early termination optimizations
- **Standard cases**: Achieve consistent 20-30% improvements from reduced parsing
- **Large-scale tests**: Benefit significantly (18-23% faster) as parsing overhead scales with code size

The optimization is most effective for workloads with moderate to large code files where CST parsing dominates the runtime, as evidenced by the original profiler showing 70%+ of time spent in `cst.parse_module()` and `module.visit()` operations.
Signed-off-by: Saurabh Misra <[email protected]>
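As a rough illustration of the parse-once pattern described in the commit above, the sketch below parses each module a single time with libcst and reuses the resulting CST objects. It is a minimal sketch, not the real codeflash implementation: the function bodies, signatures, and the suffixed name `add_global_assignments_sketch` are simplified stand-ins, though `_extract_global_statements_once`, `deep_equals()`, and `cst.parse_module()` are taken from the explanation.

```python
import libcst as cst


def _extract_global_statements_once(src_module_code: str, dst_module_code: str):
    """Parse each module exactly once and reuse the CST objects afterwards."""
    src_module = cst.parse_module(src_module_code)
    dst_module = cst.parse_module(dst_module_code)
    src_globals = [s for s in src_module.body if isinstance(s, cst.SimpleStatementLine)]
    dst_globals = [s for s in dst_module.body if isinstance(s, cst.SimpleStatementLine)]
    return src_module, dst_module, src_globals, dst_globals


def add_global_assignments_sketch(src_module_code: str, dst_module_code: str) -> str:
    _, dst_module, src_globals, dst_globals = _extract_global_statements_once(
        src_module_code, dst_module_code
    )

    # Fast-path identity check before the more expensive structural comparison.
    missing = [
        stmt
        for stmt in src_globals
        if not any(stmt is existing or stmt.deep_equals(existing) for existing in dst_globals)
    ]

    if not missing:
        # Early termination: nothing to insert, so reuse the already-parsed module
        # instead of calling cst.parse_module() on the destination code again.
        return dst_module.code

    updated = dst_module.with_changes(body=[*missing, *dst_module.body])
    return updated.code
```

Because `cst.parse_module()` dominates the runtime per the profiler, the win comes mainly from never parsing the same source text twice and from returning the cached `dst_module.code` when there is nothing to insert.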
…25-08-25T18.50.33 ⚡️ Speed up function `add_global_assignments` by 18% in PR #683 (`fix/duplicate-global-assignments-when-reverting-helpers`)
…cs-in-diff [Lsp] return diff functions grouped by file
* lsp: get new/modified functions inside a git commit
* better name
* refactor
* revert
* save optimization patches metadata
* typo
* lsp: get previous optimizations
* fix patch name in non-lsp mode
* ⚡️ Speed up function `get_patches_metadata` by 45% in PR #690 (`worktree/persist-optimization-patches`)

  The optimized code achieves a **44% speedup** through two key optimizations:

  **1. Added `@lru_cache(maxsize=1)` to `get_patches_dir_for_project()`**
  - This caches the Path object construction, avoiding repeated calls to `get_git_project_id()` and `Path()` creation
  - The line profiler shows this function's total time dropped from 5.32ms to being completely eliminated from the hot path in `get_patches_metadata()`
  - Since `get_git_project_id()` was already cached but still being called repeatedly, this second-level caching eliminates that redundancy

  **2. Replaced `read_text()` + `json.loads()` with `open()` + `json.load()`**
  - Using `json.load()` with a file handle is more efficient than reading the entire file into memory first with `read_text()` and then parsing it
  - This avoids the intermediate string creation and is particularly beneficial for larger JSON files
  - Added explicit UTF-8 encoding for consistency

  **Performance Impact by Test Type:**
  - **Basic cases** (small/missing files): 45-65% faster - benefits primarily from the caching optimization
  - **Edge cases** (malformed JSON): 38-47% faster - still benefits from both optimizations
  - **Large scale cases** (1000+ patches, large files): 39-52% faster - the file I/O optimization becomes more significant with larger JSON files

  The caching optimization provides the most consistent gains across all scenarios since it eliminates repeated expensive operations, while the file I/O optimization scales with file size.

* fix: patch path
* codeflash suggestions
* split the worktree utils in a separate file
--------
Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
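A minimal sketch of the two `get_patches_metadata` changes described in the commit above. `get_git_project_id()` is mentioned in the commit; its body here, the `~/.codeflash/patches` directory, and the `metadata.json` filename are illustrative assumptions rather than the project's actual layout.

```python
import json
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=1)
def get_git_project_id() -> str:
    # Stand-in for codeflash's real helper, which derives a stable id from the git repo.
    return "example-project-id"


@lru_cache(maxsize=1)
def get_patches_dir_for_project() -> Path:
    # Cached so the project-id lookup and Path construction happen once per process
    # rather than on every call to get_patches_metadata().
    return Path.home() / ".codeflash" / "patches" / get_git_project_id()  # assumed location


def get_patches_metadata() -> dict:
    metadata_file = get_patches_dir_for_project() / "metadata.json"  # assumed filename
    if not metadata_file.exists():
        return {}
    # json.load on an open handle skips the intermediate string that
    # read_text() + json.loads() would build, and the encoding is explicit.
    with metadata_file.open("r", encoding="utf-8") as f:
        return json.load(f)
```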
Deque Comparator
* LSP reduce no of candidates
* config revert
* pass reference values to aiservices
* line profiling loading msg
--------
Co-authored-by: saga4 <[email protected]>
Co-authored-by: ali <[email protected]>
* LSP reduce no of candidates
* config revert
* pass reference values to aiservices
* fix inline condition
--------
Co-authored-by: saga4 <[email protected]>
import variable correctly
Signed-off-by: Saurabh Misra <[email protected]>
support attrs comparison
apscheduler tries to schedule jobs when the interpreter is shutting down which can cause it to crash and leave us in a bad state
patch apscheduler when tracing
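One way such a guard could look is sketched below: refuse new job submissions once interpreter finalization has begun. This is purely illustrative; the actual patch codeflash applies while tracing may target different APScheduler internals.

```python
import sys
from apscheduler.schedulers.background import BackgroundScheduler

_original_add_job = BackgroundScheduler.add_job


def _safe_add_job(self, *args, **kwargs):
    # Once the interpreter is finalizing, silently drop new jobs instead of letting
    # the scheduler thread crash mid-shutdown and leave the tracer in a bad state.
    if sys.is_finalizing():
        return None
    return _original_add_job(self, *args, **kwargs)


def patch_apscheduler_for_tracing() -> None:
    # Monkeypatch intended to be applied only while the tracer is active.
    BackgroundScheduler.add_job = _safe_add_job
```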
The optimized version eliminates recursive function calls by replacing the recursive `_find` helper with an iterative approach. This provides significant performance benefits:

**Key Optimizations:**

1. **Removed Recursion Overhead**: The original code used a recursive helper function `_find` that created new stack frames for each parent traversal. The optimized version uses a simple iterative loop that traverses parents sequentially without function call overhead.
2. **Eliminated Function Creation**: The original code defined the `_find` function on every call to `find_target_node`. The optimized version removes this repeated function definition entirely.
3. **Early Exit with for-else**: The optimized code uses Python's `for-else` construct to immediately return `None` when a parent class isn't found, avoiding unnecessary continued searching.
4. **Reduced Attribute Access**: By caching `function_to_optimize.function_name` in a local variable `target_name` and reusing `body` variables, the code reduces repeated attribute lookups.

**Performance Impact by Test Case:**

- **Simple cases** (top-level functions, basic class methods): 23-62% faster due to eliminated recursion overhead
- **Nested class scenarios**: 45-84% faster, with deeper nesting showing greater improvements as recursion elimination has more impact
- **Large-scale tests**: 12-22% faster, showing consistent benefits even with many nodes to traverse
- **Edge cases** (empty modules, non-existent classes): 52-76% faster due to more efficient early termination

The optimization is particularly effective for deeply nested class hierarchies where the original recursive approach created multiple stack frames, while the iterative version maintains constant memory usage regardless of nesting depth.
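A hedged sketch of that iterative shape using plain `ast` nodes. The `function_to_optimize` object, its `parents` attribute, and the `.name` field on each parent are assumptions about codeflash's data model; the real `find_target_node` may differ in detail.

```python
import ast


def find_target_node_sketch(function_to_optimize, module_node: ast.Module):
    target_name = function_to_optimize.function_name  # cache the attribute lookup once
    body = module_node.body

    # Walk the enclosing classes iteratively instead of recursing with a nested _find().
    for parent in function_to_optimize.parents:
        for node in body:
            if isinstance(node, ast.ClassDef) and node.name == parent.name:
                body = node.body  # descend into the matching class body
                break
        else:
            # for-else: no matching parent class at this level, so bail out immediately.
            return None

    for node in body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == target_name:
            return node
    return None
```

The for-else keeps the early exit explicit, and memory use stays constant regardless of how deeply the classes are nested.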
…25-09-25T14.28.58 ⚡️ Speed up function `find_target_node` by 18% in PR #763 (`fix/correctly-find-funtion-node-when-reverting-helpers`)
…node-when-reverting-helpers [FIX] Respect parent classes in revert helpers
Granular async instrumentation
…d move other merged test below; finish resolving aiservice/config/explanation/function_optimizer; regenerate uv.lock
The optimized code achieves an 11% speedup through several targeted micro-optimizations that reduce attribute lookups and function call overhead in the hot path:

**Key Optimizations:**

1. **Local Variable Caching**: Frequently accessed attributes (`self.context_stack`, `self.original_runtimes`, etc.) are stored in local variables at function start, eliminating repeated `self.` lookups in tight loops.
2. **Method Reference Caching**: `self.get_comment` is cached as `get_comment`, and `context_stack.append`/`pop` are stored locally, reducing method lookup overhead in the main processing loop.
3. **Type Tuple Pre-computation**: The commonly used type tuples for `isinstance` checks are stored in local variables (`_stmt_types`, `_node_stmt_assign`), avoiding tuple creation on every iteration.
4. **Optimized Node Collection**: The inefficient pattern of creating a list then extending it (`nodes_to_check = [...]; nodes_to_check.extend(...)`) is replaced with conditional unpacking (`[compound_line_node, *body_attr]` or `[compound_line_node]`), reducing list operations.
5. **f-string Usage**: String concatenations for `inv_id` and `match_key` are converted to f-strings, which are faster than concatenation operations.

**Performance Characteristics:**

- Best gains on **large-scale test cases** (24.9-32.5% faster) with many nested blocks or statements, where the micro-optimizations compound
- Minimal overhead on **simple cases** (0.6-7.9% variance), showing the optimizations don't hurt baseline performance
- Most effective when processing complex ASTs with deep nesting, as seen in the `test_large_many_nested_blocks` (24.9% faster) and `test_large_sparse_runtime_keys` (32.5% faster) cases

The optimizations target the innermost loops where attribute lookups and object creation happen most frequently, making them particularly effective for batch AST processing workflows.
⚡️ This pull request contains optimizations for PR #678
If you approve this dependent PR, these changes will be merged into the original PR branch standalone-fto-async.

📄 11% (0.11x) speedup for CommentMapper.visit_FunctionDef in codeflash/code_utils/edit_generated_tests.py

⏱️ Runtime: 2.62 milliseconds → 2.36 milliseconds (best of 295 runs)

📝 Explanation and details
The optimized code achieves an 11% speedup through several targeted micro-optimizations that reduce attribute lookups and function call overhead in the hot path:

Key Optimizations:

1. Local Variable Caching: Frequently accessed attributes (`self.context_stack`, `self.original_runtimes`, etc.) are stored in local variables at function start, eliminating repeated `self.` lookups in tight loops.
2. Method Reference Caching: `self.get_comment` is cached as `get_comment`, and `context_stack.append`/`pop` are stored locally, reducing method lookup overhead in the main processing loop.
3. Type Tuple Pre-computation: The commonly used type tuples for `isinstance` checks are stored in local variables (`_stmt_types`, `_node_stmt_assign`), avoiding tuple creation on every iteration.
4. Optimized Node Collection: The inefficient pattern of creating a list then extending it (`nodes_to_check = [...]; nodes_to_check.extend(...)`) is replaced with conditional unpacking (`[compound_line_node, *body_attr]` or `[compound_line_node]`), reducing list operations.
5. f-string Usage: String concatenations for `inv_id` and `match_key` are converted to f-strings, which are faster than concatenation operations.

Performance Characteristics:

- Best gains on large-scale test cases (24.9-32.5% faster) with many nested blocks or statements, where the micro-optimizations compound
- Minimal overhead on simple cases (0.6-7.9% variance), showing the optimizations don't hurt baseline performance
- Most effective when processing complex ASTs with deep nesting, as seen in the `test_large_many_nested_blocks` (24.9% faster) and `test_large_sparse_runtime_keys` (32.5% faster) cases

The optimizations target the innermost loops where attribute lookups and object creation happen most frequently, making them particularly effective for batch AST processing workflows.
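The patterns above are generic CPython micro-optimizations; a toy visitor in the same spirit might look like the sketch below. It is not the actual `CommentMapper`: the constructor arguments, the statement types in the tuple, and the `qual_prefix` key scheme are illustrative assumptions.

```python
import ast


class CommentMapperSketch(ast.NodeVisitor):
    """Toy stand-in illustrating local caching, a pre-built isinstance tuple, and f-strings."""

    def __init__(self, original_runtimes, optimized_runtimes):
        self.original_runtimes = original_runtimes
        self.optimized_runtimes = optimized_runtimes
        self.context_stack = []
        self.results = {}

    def get_comment(self, key):
        return f"was {self.original_runtimes.get(key)}, now {self.optimized_runtimes.get(key)}"

    def visit_FunctionDef(self, node):
        # Cache attribute and method lookups in locals once, outside the hot loop.
        context_stack = self.context_stack
        push, pop = context_stack.append, context_stack.pop
        get_comment = self.get_comment
        results = self.results
        _stmt_types = (ast.Assign, ast.AugAssign, ast.Expr)  # pre-built isinstance tuple

        push(node.name)
        qual_prefix = ".".join(context_stack)
        for i, stmt in enumerate(node.body):
            if isinstance(stmt, _stmt_types):
                match_key = f"{qual_prefix}:{i}"  # f-string instead of repeated concatenation
                results[match_key] = get_comment(match_key)
        self.generic_visit(node)  # visit nested defs while the name is still on the stack
        pop()
```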
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-pr678-2025-09-26T19.48.07` and push.