Conversation

@codeflash-ai codeflash-ai bot (Contributor) commented Sep 26, 2025

⚡️ This pull request contains optimizations for PR #678

If you approve this dependent PR, these changes will be merged into the original PR branch `standalone-fto-async`.

This PR will be automatically closed if the original PR is merged.


📄 11% (0.11x) speedup for `CommentMapper.visit_AsyncFunctionDef` in `codeflash/code_utils/edit_generated_tests.py`

⏱️ Runtime: 5.28 milliseconds → 4.76 milliseconds (best of 117 runs)

📝 Explanation and details

The optimized code achieves a 10% speedup through several key micro-optimizations that reduce overhead in the performance-critical loops:

**Key Optimizations:**

1. **Hoisted repeated attribute lookups**: Variables like `node_body = node.body`, `original_runtimes = self.original_runtimes`, and `results = self.results` are cached once outside the loops instead of being accessed repeatedly via `self.` attribute lookups.

2. **Cached type objects and method references**: `isinstance_stmt = ast.stmt`, `isinstance_control = (ast.With, ast.For, ast.While, ast.If)`, and `get_comment = self.get_comment` eliminate repeated global/attribute lookups in the hot loops.

3. **Improved string formatting**: Replaced string concatenation (`str(i) + "_" + str(j)`) with f-string formatting (`f"{i}_{j}"`), which is more efficient in Python.

4. **Optimized getattr usage**: Changed `getattr(compound_line_node, "body", [])` to `getattr(compound_line_node, "body", None)` with a conditional check, avoiding list creation when no body exists.

**Why it's faster**: The profiler shows the main performance bottleneck is in the nested loops processing control flow statements. By eliminating repetitive attribute lookups and method calls that happen thousands of times (2,729 iterations in the outer loop, 708 in nested loops), the optimization reduces per-iteration overhead.

**Test case performance**: The optimizations show the biggest gains on large-scale test cases with many statements (9-22% faster) and mixed control blocks, while having minimal impact on simple cases with few statements (often slightly slower due to setup overhead).
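The hoisting pattern behind optimizations 1 and 2 can be sketched on a toy AST walker; the names here are illustrative, not the actual `CommentMapper` internals:

```python
import ast

def sum_assign_linenos_slow(tree: ast.Module) -> int:
    # Baseline: type tuple and global lookup repeated on every iteration.
    total = 0
    for node in ast.walk(tree):
        if isinstance(node, (ast.Assign, ast.AugAssign)):
            total += node.lineno
    return total

def sum_assign_linenos_fast(tree: ast.Module) -> int:
    # Optimized: hoist the lookups out of the loop, as the PR does with
    # node.body, self.original_runtimes, and self.get_comment.
    assign_types = (ast.Assign, ast.AugAssign)  # cached type tuple
    walk = ast.walk                             # cached global lookup
    total = 0
    for node in walk(tree):
        if isinstance(node, assign_types):
            total += node.lineno
    return total

tree = ast.parse("x = 1\ny = 2\nx += 3\n")
assert sum_assign_linenos_slow(tree) == sum_assign_linenos_fast(tree) == 6
```

The savings per iteration are tiny, which is why the gains only show up once the loops run thousands of times.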

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 46 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import ast
from pathlib import Path

# imports
import pytest  # used for our unit tests
from codeflash.code_utils.edit_generated_tests import CommentMapper


class GeneratedTests:
    def __init__(self, behavior_file_path):
        self.behavior_file_path = Path(behavior_file_path)


# Helper function to parse async function source and return ast.AsyncFunctionDef node
def get_async_func_node(src: str) -> ast.AsyncFunctionDef:
    module = ast.parse(src)
    for node in module.body:
        if isinstance(node, ast.AsyncFunctionDef):
            return node
    raise ValueError("No AsyncFunctionDef found in source")


# ------------------- UNIT TESTS -------------------

# 1. Basic Test Cases

def test_basic_single_async_function_no_body():
    # Async function with empty body
    src = "async def foo():\n    pass"
    node = get_async_func_node(src)
    test = GeneratedTests("testfile.py")
    original = {}
    optimized = {}
    mapper = CommentMapper(test, original, optimized)
    result_node = mapper.visit_AsyncFunctionDef(node) # 5.59μs -> 6.00μs (6.85% slower)

def test_basic_single_async_function_with_body():
    # Async function with simple statements
    src = "async def foo():\n    x = 1\n    y = 2"
    node = get_async_func_node(src)
    test = GeneratedTests("testfile.py")
    # Prepare runtime dicts for both lines
    key_base = "foo#testfile#"  # abs_path will be 'testfile'
    original = {key_base + "0": 100, key_base + "1": 200}
    optimized = {key_base + "0": 50, key_base + "1": 100}
    mapper = CommentMapper(test, original, optimized)
    mapper.visit_AsyncFunctionDef(node) # 13.0μs -> 13.4μs (2.85% slower)

def test_basic_async_function_with_if_for():
    # Async function with control flow
    src = ("async def foo():\n"
           "    x = 1\n"
           "    if x:\n"
           "        y = 2\n"
           "    for i in range(2):\n"
           "        z = i\n")
    node = get_async_func_node(src)
    test = GeneratedTests("testfile.py")
    key_base = "foo#testfile#"
    # Prepare keys for main body, if, and for statements
    original = {
        key_base + "0": 100,  # x = 1
        key_base + "1_0": 200,  # y = 2 (in if)
        key_base + "2_0": 300   # z = i (in for)
    }
    optimized = {
        key_base + "0": 50,
        key_base + "1_0": 100,
        key_base + "2_0": 150
    }
    mapper = CommentMapper(test, original, optimized)
    mapper.visit_AsyncFunctionDef(node) # 16.3μs -> 16.4μs (0.974% slower)
    # Should have comments for x = 1, y = 2, z = i
    found = [False, False, False]
    for k, v in mapper.results.items():
        if "# 100ns -> 50ns" in v: found[0] = True
        if "# 200ns -> 100ns" in v: found[1] = True
        if "# 300ns -> 150ns" in v: found[2] = True
    assert all(found), "expected timing comments for all three statements"

# 2. Edge Test Cases

def test_edge_no_matching_keys():
    # No runtime keys match any statement
    src = "async def foo():\n    x = 1\n    y = 2"
    node = get_async_func_node(src)
    test = GeneratedTests("testfile.py")
    original = {"foo#testfile#99": 100}
    optimized = {"foo#testfile#99": 50}
    mapper = CommentMapper(test, original, optimized)
    mapper.visit_AsyncFunctionDef(node) # 5.89μs -> 6.25μs (5.77% slower)

def test_edge_missing_optimized_runtime():
    # Only original runtime present
    src = "async def foo():\n    x = 1"
    node = get_async_func_node(src)
    test = GeneratedTests("testfile.py")
    key_base = "foo#testfile#0"
    original = {key_base: 100}
    optimized = {}  # missing optimized
    mapper = CommentMapper(test, original, optimized)
    mapper.visit_AsyncFunctionDef(node) # 5.24μs -> 5.67μs (7.58% slower)

def test_edge_missing_original_runtime():
    # Only optimized runtime present
    src = "async def foo():\n    x = 1"
    node = get_async_func_node(src)
    test = GeneratedTests("testfile.py")
    key_base = "foo#testfile#0"
    original = {}
    optimized = {key_base: 50}
    mapper = CommentMapper(test, original, optimized)
    mapper.visit_AsyncFunctionDef(node) # 4.87μs -> 5.45μs (10.7% slower)

def test_edge_nested_control_flow():
    # Deeply nested control flow
    src = (
        "async def foo():\n"
        "    x = 1\n"
        "    if True:\n"
        "        for i in range(2):\n"
        "            with open('test.txt') as f:\n"
        "                y = f.read()\n"
    )
    node = get_async_func_node(src)
    test = GeneratedTests("testfile.py")
    key_base = "foo#testfile#"
    # Prepare keys for x = 1, y = f.read()
    original = {
        key_base + "0": 100,  # x = 1
        key_base + "1_0_0_0": 200,  # y = f.read()
    }
    optimized = {
        key_base + "0": 50,
        key_base + "1_0_0_0": 100,
    }
    # Note: The nested inv_id will not be found by the implementation above, which only goes 2 levels deep.
    # So only x = 1 will be matched.
    mapper = CommentMapper(test, original, optimized)
    mapper.visit_AsyncFunctionDef(node) # 12.7μs -> 12.8μs (0.624% slower)

def test_edge_async_function_with_no_statements():
    # Async function with only docstring
    src = 'async def foo():\n    """docstring"""'
    node = get_async_func_node(src)
    test = GeneratedTests("testfile.py")
    original = {}
    optimized = {}
    mapper = CommentMapper(test, original, optimized)
    mapper.visit_AsyncFunctionDef(node) # 5.08μs -> 5.35μs (5.07% slower)

def test_edge_async_function_with_assign_and_expr():
    # Async function with assignment and expression (not an assign)
    src = "async def foo():\n    x = 1\n    print(x)"
    node = get_async_func_node(src)
    test = GeneratedTests("testfile.py")
    key_base = "foo#testfile#"
    original = {key_base + "0": 100}
    optimized = {key_base + "0": 50}
    mapper = CommentMapper(test, original, optimized)
    mapper.visit_AsyncFunctionDef(node) # 10.3μs -> 10.3μs (0.678% slower)

# 3. Large Scale Test Cases

def test_large_many_statements():
    # Async function with many statements (up to 1000)
    lines = ["    x{} = {}".format(i, i) for i in range(1000)]
    src = "async def foo():\n" + "\n".join(lines)
    node = get_async_func_node(src)
    test = GeneratedTests("testfile.py")
    key_base = "foo#testfile#"
    # Prepare runtime dicts for all lines
    original = {key_base + str(i): i + 1000 for i in range(1000)}
    optimized = {key_base + str(i): i for i in range(1000)}
    mapper = CommentMapper(test, original, optimized)
    mapper.visit_AsyncFunctionDef(node) # 1.78ms -> 1.61ms (10.6% faster)
    # Check a few random indices for correctness
    for idx in [0, 10, 500, 999]:
        lineno = node.body[idx].lineno
        comment = mapper.results[lineno]

def test_large_many_control_flow_blocks():
    # Async function with many 'if' blocks
    src = "async def foo():\n"
    for i in range(100):
        src += f"    if True:\n        x{i} = {i}\n"
    node = get_async_func_node(src)
    test = GeneratedTests("testfile.py")
    key_base = "foo#testfile#"
    # Prepare runtime dicts for all if blocks
    original = {key_base + f"{i}_0": i + 100 for i in range(100)}
    optimized = {key_base + f"{i}_0": i for i in range(100)}
    mapper = CommentMapper(test, original, optimized)
    mapper.visit_AsyncFunctionDef(node) # 218μs -> 189μs (14.8% faster)
    for i in range(100):
        # Find the assignment node inside the if block
        assign_node = node.body[i].body[0]
        lineno = assign_node.lineno
        comment = mapper.results[lineno]

def test_large_performance_scaling():
    # Async function with 500 assignments and 500 for loops each with assignment
    src = "async def foo():\n"
    for i in range(500):
        src += f"    x{i} = {i}\n"
    for i in range(500):
        src += f"    for j in range(1):\n        y{i} = j\n"
    node = get_async_func_node(src)
    test = GeneratedTests("testfile.py")
    key_base = "foo#testfile#"
    # Prepare runtime dicts for all assignments and for loop assignments
    original = {key_base + f"{i}": i + 2000 for i in range(500)}
    optimized = {key_base + f"{i}": i for i in range(500)}
    for i in range(500, 1000):
        original[key_base + f"{i}_0"] = i + 2000
        optimized[key_base + f"{i}_0"] = i
    mapper = CommentMapper(test, original, optimized)
    mapper.visit_AsyncFunctionDef(node) # 2.04ms -> 1.82ms (12.4% faster)
    # Check a few random indices for correctness
    for idx in [0, 499, 500, 999]:
        if idx < 500:
            lineno = node.body[idx].lineno
            comment = mapper.results[lineno]
        else:
            assign_node = node.body[idx].body[0]
            lineno = assign_node.lineno
            comment = mapper.results[lineno]
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import ast
from pathlib import Path

# imports
import pytest  # used for our unit tests
from codeflash.code_utils.edit_generated_tests import CommentMapper


class GeneratedTests:
    def __init__(self, behavior_file_path: Path):
        self.behavior_file_path = behavior_file_path

# Helper to parse async function source and return ast.AsyncFunctionDef node
def get_async_func_node(source: str) -> ast.AsyncFunctionDef:
    mod = ast.parse(source)
    for node in mod.body:
        if isinstance(node, ast.AsyncFunctionDef):
            return node
    raise ValueError("No AsyncFunctionDef found in source.")

# Helper to create match keys for a given function name, path, and body indices
def make_key(func_name, path, inv_id):
    return f"{func_name}#{path}#{inv_id}"

# ========== Basic Test Cases ==========

def test_single_async_function_single_statement():
    # Test: basic async function with a single statement
    src = "async def foo():\n    x = 1\n"
    node = get_async_func_node(src)
    path = Path("/tmp/abc.py")
    test = GeneratedTests(path)
    key = make_key("foo", path.with_suffix(""), "0")
    orig = {key: 1000000}
    opt = {key: 500000}
    mapper = CommentMapper(test, orig, opt)
    mapper.visit_AsyncFunctionDef(node) # 10.7μs -> 11.6μs (8.21% slower)

def test_async_function_multiple_statements():
    # Test: async function with multiple statements
    src = "async def foo():\n    x = 1\n    y = 2\n"
    node = get_async_func_node(src)
    path = Path("/tmp/abc.py")
    test = GeneratedTests(path)
    key1 = make_key("foo", path.with_suffix(""), "0")
    key2 = make_key("foo", path.with_suffix(""), "1")
    orig = {key1: 2000000, key2: 3000000}
    opt = {key1: 1000000, key2: 1500000}
    mapper = CommentMapper(test, orig, opt)
    mapper.visit_AsyncFunctionDef(node) # 12.1μs -> 12.7μs (4.42% slower)

def test_async_function_with_if():
    # Test: async function with an if statement containing assignments
    src = (
        "async def foo():\n"
        "    if True:\n"
        "        x = 1\n"
        "        y = 2\n"
        "    z = 3\n"
    )
    node = get_async_func_node(src)
    path = Path("/tmp/abc.py")
    test = GeneratedTests(path)
    key_if_x = make_key("foo", path.with_suffix(""), "0_0")
    key_if_y = make_key("foo", path.with_suffix(""), "0_1")
    key_z = make_key("foo", path.with_suffix(""), "1")
    orig = {key_if_x: 100000, key_if_y: 200000, key_z: 300000}
    opt = {key_if_x: 50000, key_if_y: 100000, key_z: 150000}
    mapper = CommentMapper(test, orig, opt)
    mapper.visit_AsyncFunctionDef(node) # 17.4μs -> 17.3μs (0.874% faster)

# ========== Edge Test Cases ==========

def test_async_function_no_body():
    # Test: async function with empty body
    src = "async def foo():\n    pass\n"
    node = get_async_func_node(src)
    path = Path("/tmp/abc.py")
    test = GeneratedTests(path)
    mapper = CommentMapper(test, {}, {})
    mapper.visit_AsyncFunctionDef(node) # 5.64μs -> 5.86μs (3.74% slower)

def test_async_function_missing_runtimes():
    # Test: async function where some statements lack runtime info
    src = "async def foo():\n    x = 1\n    y = 2\n"
    node = get_async_func_node(src)
    path = Path("/tmp/abc.py")
    test = GeneratedTests(path)
    key1 = make_key("foo", path.with_suffix(""), "0")
    # Only x = 1 has runtime info
    orig = {key1: 1000000}
    opt = {key1: 500000}
    mapper = CommentMapper(test, orig, opt)
    mapper.visit_AsyncFunctionDef(node) # 10.6μs -> 10.5μs (0.763% faster)

def test_async_function_nested_for_and_if():
    # Test: async function with nested for and if blocks
    src = (
        "async def foo():\n"
        "    for i in range(2):\n"
        "        if i:\n"
        "            x = i\n"
        "        y = i\n"
        "    z = 3\n"
    )
    node = get_async_func_node(src)
    path = Path("/tmp/abc.py")
    test = GeneratedTests(path)
    # for: i=0, i=1; if: i=0, i=1
    key_for_if_x = make_key("foo", path.with_suffix(""), "0_0")
    key_for_y = make_key("foo", path.with_suffix(""), "0_1")
    key_z = make_key("foo", path.with_suffix(""), "1")
    orig = {key_for_if_x: 1000, key_for_y: 2000, key_z: 3000}
    opt = {key_for_if_x: 500, key_for_y: 1000, key_z: 1500}
    mapper = CommentMapper(test, orig, opt)
    mapper.visit_AsyncFunctionDef(node) # 18.9μs -> 19.1μs (0.630% slower)

def test_async_function_performance_slower():
    # Test: performance is slower after optimization
    src = "async def foo():\n    x = 1\n"
    node = get_async_func_node(src)
    path = Path("/tmp/abc.py")
    test = GeneratedTests(path)
    key = make_key("foo", path.with_suffix(""), "0")
    orig = {key: 100000}
    opt = {key: 200000}
    mapper = CommentMapper(test, orig, opt)
    mapper.visit_AsyncFunctionDef(node) # 9.22μs -> 9.67μs (4.66% slower)

def test_async_function_zero_original_runtime():
    # Test: original runtime is zero (edge case)
    src = "async def foo():\n    x = 1\n"
    node = get_async_func_node(src)
    path = Path("/tmp/abc.py")
    test = GeneratedTests(path)
    key = make_key("foo", path.with_suffix(""), "0")
    orig = {key: 0}
    opt = {key: 1000}
    mapper = CommentMapper(test, orig, opt)
    mapper.visit_AsyncFunctionDef(node) # 8.85μs -> 9.54μs (7.26% slower)

def test_async_function_with_with_block():
    # Test: async function with a with block containing assignments
    src = (
        "async def foo():\n"
        "    with open('file') as f:\n"
        "        x = f.read()\n"
        "    y = 2\n"
    )
    node = get_async_func_node(src)
    path = Path("/tmp/abc.py")
    test = GeneratedTests(path)
    key_with_x = make_key("foo", path.with_suffix(""), "0_0")
    key_y = make_key("foo", path.with_suffix(""), "1")
    orig = {key_with_x: 10000, key_y: 20000}
    opt = {key_with_x: 5000, key_y: 10000}
    mapper = CommentMapper(test, orig, opt)
    mapper.visit_AsyncFunctionDef(node) # 13.8μs -> 13.6μs (1.11% faster)

# ========== Large Scale Test Cases ==========

def test_large_async_function_many_statements():
    # Test: async function with many statements
    N = 500
    src_lines = ["async def foo():\n"]
    for i in range(N):
        src_lines.append(f"    x{i} = {i}\n")
    src = "".join(src_lines)
    node = get_async_func_node(src)
    path = Path("/tmp/abc.py")
    test = GeneratedTests(path)
    orig = {}
    opt = {}
    for i in range(N):
        key = make_key("foo", path.with_suffix(""), str(i))
        orig[key] = 1000 * (i + 1)
        opt[key] = 500 * (i + 1)
    mapper = CommentMapper(test, orig, opt)
    mapper.visit_AsyncFunctionDef(node) # 929μs -> 852μs (9.05% faster)
    # All lines from 2 to N+1 should have comments
    for lineno in range(2, N + 2):
        # Check correct formatting for a few random lines
        if lineno in [2, N//2, N+1]:
            i = lineno - 2
            expected = f"# {orig[make_key('foo', path.with_suffix(''), str(i))]/1e6:.2f}ms -> {opt[make_key('foo', path.with_suffix(''), str(i))]/1e6:.2f}ms (50.00% faster)"


def test_large_async_function_mixed_blocks():
    # Test: async function with a mix of for, if, and with blocks, many statements
    N = 100
    src_lines = ["async def foo():\n"]
    for i in range(N):
        src_lines.append(f"    for j{i} in range(1):\n")
        src_lines.append("        if True:\n")
        src_lines.append("            with open('file') as f:\n")
        src_lines.append(f"                x{i} = f.read()\n")
    src = "".join(src_lines)
    node = get_async_func_node(src)
    path = Path("/tmp/abc.py")
    test = GeneratedTests(path)
    orig = {}
    opt = {}
    for i in range(N):
        key = make_key("foo", path.with_suffix(""), f"{i}_0_0")
        orig[key] = 10000 * (i + 1)
        opt[key] = 5000 * (i + 1)
    mapper = CommentMapper(test, orig, opt)
    mapper.visit_AsyncFunctionDef(node) # 126μs -> 103μs (22.3% faster)
    # Each x{i} assignment is at line 4*i + 5
    for i in range(N):
        lineno = 4 * i + 5
        expected = f"# {orig[make_key('foo', path.with_suffix(''), f'{i}_0_0')]/1e6:.2f}ms -> {opt[make_key('foo', path.with_suffix(''), f'{i}_0_0')]/1e6:.2f}ms (50.00% faster)"
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, `git checkout codeflash/optimize-pr678-2025-09-26T19.52.34` and push.

Codeflash

mohammedahmed18 and others added 30 commits August 22, 2025 05:58
[LSP] Ensure optimizer cleanup on server shutdown or when the client suddenly disconnects
…licate-global-assignments-when-reverting-helpers
…/duplicate-global-assignments-when-reverting-helpers`)

The optimized code achieves a **17% speedup** by eliminating redundant CST parsing operations, which are the most expensive parts of the function according to the line profiler.

**Key optimizations:**

1. **Eliminate duplicate parsing**: The original code parsed `src_module_code` and `dst_module_code` multiple times. The optimized version introduces `_extract_global_statements_once()` that parses each module only once and reuses the parsed CST objects throughout the function.

2. **Reuse parsed modules**: Instead of re-parsing `dst_module_code` after modifications, the optimized version conditionally reuses the already-parsed `dst_module` when no global statements need insertion, avoiding unnecessary `cst.parse_module()` calls.

3. **Early termination**: Added an early return when `new_collector.assignments` is empty, avoiding the expensive `GlobalAssignmentTransformer` creation and visitation when there's nothing to transform.

4. **Minor optimization in uniqueness check**: Added a fast-path identity check (`stmt is existing_stmt`) before the expensive `deep_equals()` comparison, though this has minimal impact.

**Performance impact by test case type:**
- **Empty/minimal cases**: Show the highest gains (59-88% faster) due to early termination optimizations
- **Standard cases**: Achieve consistent 20-30% improvements from reduced parsing
- **Large-scale tests**: Benefit significantly (18-23% faster) as parsing overhead scales with code size

The optimization is most effective for workloads with moderate to large code files where CST parsing dominates the runtime, as evidenced by the original profiler showing 70%+ of time spent in `cst.parse_module()` and `module.visit()` operations.
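The parse-once idea generalizes beyond libcst: parse each source string a single time and thread the parsed object through, instead of re-parsing at every step. A minimal sketch using the stdlib `ast` module (the actual code works on libcst's `cst.parse_module` output; the function names here are illustrative):

```python
import ast

def count_globals_reparsing(src: str) -> int:
    # Original pattern: each helper re-parses the same source string.
    assigns = sum(isinstance(n, ast.Assign) for n in ast.parse(src).body)
    funcs = sum(isinstance(n, ast.FunctionDef) for n in ast.parse(src).body)
    return assigns + funcs

def count_globals_once(src: str) -> int:
    # Optimized pattern: parse once and reuse the module object.
    module = ast.parse(src)
    body = module.body
    if not body:  # early termination, mirroring the empty-assignments fast path
        return 0
    assigns = sum(isinstance(n, ast.Assign) for n in body)
    funcs = sum(isinstance(n, ast.FunctionDef) for n in body)
    return assigns + funcs

src = "x = 1\ndef f():\n    pass\ny = 2\n"
assert count_globals_reparsing(src) == count_globals_once(src) == 3
```

Since parsing cost grows with file size, this refactor pays off most on large modules, matching the 18-23% gains reported for the large-scale tests.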
Signed-off-by: Saurabh Misra <[email protected]>
…25-08-25T18.50.33

⚡️ Speed up function `add_global_assignments` by 18% in PR #683 (`fix/duplicate-global-assignments-when-reverting-helpers`)
…cs-in-diff

[Lsp] return diff functions grouped by file
* lsp: get new/modified functions inside a git commit

* better name

* refactor

* revert
* save optimization patches metadata

* typo

* lsp: get previous optimizations

* fix patch name in non-lsp mode

* ⚡️ Speed up function `get_patches_metadata` by 45% in PR #690 (`worktree/persist-optimization-patches`)

The optimized code achieves a **44% speedup** through two key optimizations:

**1. Added `@lru_cache(maxsize=1)` to `get_patches_dir_for_project()`**
- This caches the Path object construction, avoiding repeated calls to `get_git_project_id()` and `Path()` creation
- The line profiler shows this function's total time dropped from 5.32ms to being completely eliminated from the hot path in `get_patches_metadata()`
- Since `get_git_project_id()` was already cached but still being called repeatedly, this second-level caching eliminates that redundancy

**2. Replaced `read_text()` + `json.loads()` with `open()` + `json.load()`**
- Using `json.load()` with a file handle is more efficient than reading the entire file into memory first with `read_text()` then parsing it
- This avoids the intermediate string creation and is particularly beneficial for larger JSON files
- Added explicit UTF-8 encoding for consistency

**Performance Impact by Test Type:**
- **Basic cases** (small/missing files): 45-65% faster - benefits primarily from the caching optimization
- **Edge cases** (malformed JSON): 38-47% faster - still benefits from both optimizations  
- **Large scale cases** (1000+ patches, large files): 39-52% faster - the file I/O optimization becomes more significant with larger JSON files

The caching optimization provides the most consistent gains across all scenarios since it eliminates repeated expensive operations, while the file I/O optimization scales with file size.
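Both optimizations can be sketched in a few lines; the function names and directory layout below are illustrative, not the actual codeflash API:

```python
import json
import tempfile
from functools import lru_cache
from pathlib import Path

@lru_cache(maxsize=1)
def get_patches_dir(base: str) -> Path:
    # Cache the Path construction so repeated calls skip rebuilding it
    # (the real code additionally avoids re-deriving the git project id).
    return Path(base) / "patches"

def load_metadata(path: Path) -> dict:
    # json.load on a file handle avoids the intermediate string that
    # read_text() + json.loads() would create.
    with open(path, encoding="utf-8") as f:
        return json.load(f)

with tempfile.TemporaryDirectory() as tmp:
    d = get_patches_dir(tmp)
    assert get_patches_dir(tmp) is d  # second call served from the cache
    meta_file = Path(tmp) / "meta.json"
    meta_file.write_text(json.dumps({"patches": [1, 2, 3]}), encoding="utf-8")
    assert load_metadata(meta_file) == {"patches": [1, 2, 3]}
```

`maxsize=1` suffices here because the function is always called with the same project; a larger cache would only add bookkeeping.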

* fix: patch path

* codeflash suggestions

* split the worktree utils in a separate file

---------

Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
Saga4 and others added 24 commits September 22, 2025 15:55
* LSP reduce no of candidates

* config revert

* pass reference values to aiservices

* line profiling loading msg

---------

Co-authored-by: saga4 <[email protected]>
Co-authored-by: ali <[email protected]>
* LSP reduce no of candidates

* config revert

* pass reference values to aiservices

* fix inline condition

---------

Co-authored-by: saga4 <[email protected]>
Signed-off-by: Saurabh Misra <[email protected]>
apscheduler tries to schedule jobs when the interpreter is shutting down which can cause it to crash and leave us in a bad state
The optimized version eliminates recursive function calls by replacing the recursive `_find` helper with an iterative approach. This provides significant performance benefits:

**Key Optimizations:**

1. **Removed Recursion Overhead**: The original code used a recursive helper function `_find` that created new stack frames for each parent traversal. The optimized version uses a simple iterative loop that traverses parents sequentially without function call overhead.

2. **Eliminated Function Creation**: The original code defined the `_find` function on every call to `find_target_node`. The optimized version removes this repeated function definition entirely.

3. **Early Exit with for-else**: The optimized code uses Python's `for-else` construct to immediately return `None` when a parent class isn't found, avoiding unnecessary continued searching.

4. **Reduced Attribute Access**: By caching `function_to_optimize.function_name` in a local variable `target_name` and reusing `body` variables, the code reduces repeated attribute lookups.

**Performance Impact by Test Case:**
- **Simple cases** (top-level functions, basic class methods): 23-62% faster due to eliminated recursion overhead
- **Nested class scenarios**: 45-84% faster, with deeper nesting showing greater improvements as recursion elimination has more impact
- **Large-scale tests**: 12-22% faster, showing consistent benefits even with many nodes to traverse
- **Edge cases** (empty modules, non-existent classes): 52-76% faster due to more efficient early termination

The optimization is particularly effective for deeply nested class hierarchies where the original recursive approach created multiple stack frames, while the iterative version maintains constant memory usage regardless of nesting depth.
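The recursion-to-iteration rewrite with the `for-else` early exit can be sketched as follows; this is an illustrative reimplementation, not the actual `find_target_node` signature:

```python
import ast

def find_method_iteratively(tree: ast.Module, parents: list, target: str):
    # Walk down the parent-class chain iteratively instead of recursing:
    # no new stack frame and no nested helper defined per call.
    body = tree.body
    for class_name in parents:
        for node in body:
            if isinstance(node, ast.ClassDef) and node.name == class_name:
                body = node.body
                break
        else:
            # for-else: this parent class was not found, so bail out early.
            return None
    for node in body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == target:
            return node
    return None

tree = ast.parse("class A:\n    class B:\n        def m(self):\n            pass\n")
found = find_method_iteratively(tree, ["A", "B"], "m")
assert found is not None and found.name == "m"
assert find_method_iteratively(tree, ["A", "C"], "m") is None
```

Memory use stays constant regardless of nesting depth, which is why the deeply nested cases show the largest improvements.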
…25-09-25T14.28.58

⚡️ Speed up function `find_target_node` by 18% in PR #763 (`fix/correctly-find-funtion-node-when-reverting-helpers`)
…node-when-reverting-helpers

[FIX] Respect parent classes in revert helpers
…d move other merged test below; finish resolving aiservice/config/explanation/function_optimizer; regenerate uv.lock
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Sep 26, 2025
@KRRT7 KRRT7 force-pushed the standalone-fto-async branch from 40c4108 to 7bbb1e7 Compare September 26, 2025 20:26
@codeflash-ai codeflash-ai bot closed this Sep 27, 2025
@codeflash-ai codeflash-ai bot (Contributor, Author) commented Sep 27, 2025

This PR has been automatically closed because the original PR #678 by KRRT7 was closed.

@codeflash-ai codeflash-ai bot deleted the codeflash/optimize-pr678-2025-09-26T19.52.34 branch September 27, 2025 00:16