Skip to content

Conversation

@codeflash-ai
Copy link
Contributor

@codeflash-ai codeflash-ai bot commented Oct 21, 2025

⚡️ This pull request contains optimizations for PR #841

If you approve this dependent PR, these changes will be merged into the original PR branch fix-detect-unused-helper-functions.

This PR will be automatically closed if the original PR is merged.


📄 36% (0.36x) speedup for detect_unused_helper_functions in codeflash/context/unused_definition_remover.py

⏱️ Runtime : 37.2 milliseconds 27.5 milliseconds (best of 5 runs)

📝 Explanation and details

Key optimizations made:

  • Avoided building unnecessary generator lists during AST traversal by reducing reliance on ast.walk in _analyze_imports_in_optimized_code, switching to a manual stack-based visitor that only scans body attributes, which is all that's needed for import analysis.
  • Cached local variable lookups where they are inside loops for reduced global lookup overhead.
  • Used set.isdisjoint for checking if helper names are unused, which is faster (short-circuits) than set intersection then if not ....
  • Used in-place .add and .update to the called_function_names set to save attribute/method lookup costs.
  • Multiple small memory and speed optimizations by flattening variable accesses and minimizing unnecessary structure copying.

All comments, signatures, and behaviors are preserved and the code structure is unchanged unless a change was necessary for optimization.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 5 Passed
🌀 Generated Regression Tests 50 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 98.2%
⚙️ Existing Unit Tests and Runtime
Test File::Test Function Original ⏱️ Optimized ⏱️ Speedup
test_unused_helper_revert.py::test_async_class_methods 261μs 214μs 21.9%✅
test_unused_helper_revert.py::test_async_entrypoint_with_async_helpers 187μs 165μs 13.2%✅
test_unused_helper_revert.py::test_async_generators_and_coroutines 365μs 261μs 39.6%✅
test_unused_helper_revert.py::test_class_method_calls_external_helper_functions 208μs 175μs 18.3%✅
test_unused_helper_revert.py::test_class_method_entrypoint_with_helper_methods 227μs 197μs 15.0%✅
test_unused_helper_revert.py::test_detect_unused_helper_functions 188μs 158μs 19.0%✅
test_unused_helper_revert.py::test_detect_unused_in_multi_file_project 193μs 173μs 11.8%✅
test_unused_helper_revert.py::test_mixed_sync_and_async_helpers 320μs 239μs 33.7%✅
test_unused_helper_revert.py::test_module_dot_function_import_style 185μs 171μs 8.24%✅
test_unused_helper_revert.py::test_multi_file_import_styles 258μs 232μs 10.9%✅
test_unused_helper_revert.py::test_nested_class_method_optimization 191μs 150μs 27.0%✅
test_unused_helper_revert.py::test_no_unused_helpers_no_revert 200μs 176μs 13.4%✅
test_unused_helper_revert.py::test_recursive_helper_function_not_detected_as_unused 162μs 142μs 14.4%✅
test_unused_helper_revert.py::test_static_method_and_class_method 255μs 197μs 29.4%✅
test_unused_helper_revert.py::test_sync_entrypoint_with_async_helpers 231μs 174μs 32.4%✅
🌀 Generated Regression Tests and Runtime
from __future__ import annotations

import ast
# Patch logger
import sys
from collections import defaultdict
from itertools import chain
from pathlib import Path
from types import SimpleNamespace
from typing import Optional

# imports
import pytest
from codeflash.cli_cmds.console import logger
from codeflash.context.unused_definition_remover import \
    detect_unused_helper_functions
from codeflash.discovery.functions_to_optimize import FunctionToOptimize
from codeflash.models.models import (CodeOptimizationContext,
                                     CodeStringsMarkdown, FunctionSource)


# Minimal FunctionSource mock
class FunctionSource:
    def __init__(self, only_function_name, qualified_name=None, fully_qualified_name=None, file_path=None, jedi_definition=None):
        self.only_function_name = only_function_name
        self.qualified_name = qualified_name or only_function_name
        self.fully_qualified_name = fully_qualified_name or only_function_name
        self.file_path = file_path or Path("main.py")
        self.jedi_definition = jedi_definition or SimpleNamespace(type="function")

    def __repr__(self):
        return f"FunctionSource({self.only_function_name})"

    def __eq__(self, other):
        if not isinstance(other, FunctionSource):
            return False
        return (
            self.only_function_name == other.only_function_name and
            self.qualified_name == other.qualified_name and
            self.fully_qualified_name == other.fully_qualified_name and
            self.file_path == other.file_path
        )

    def __hash__(self):
        return hash((self.only_function_name, self.qualified_name, self.fully_qualified_name, self.file_path))

# Minimal FunctionToOptimize mock
class FunctionToOptimize:
    def __init__(self, function_name, file_path=None, parents=None):
        self.function_name = function_name
        self.file_path = file_path or Path("main.py")
        self.parents = parents or []

# Minimal CodeOptimizationContext mock
class CodeOptimizationContext:
    def __init__(self, helper_functions):
        self.helper_functions = helper_functions

# Minimal CodeStringsMarkdown mock
class CodeStringsMarkdown:
    def __init__(self, code_strings):
        self.code_strings = code_strings
from codeflash.context.unused_definition_remover import \
    detect_unused_helper_functions

# --- Unit Tests ---

# 1. Basic Test Cases

def test_no_helpers_returns_empty():
    """No helper functions: should return empty list."""
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext([])
    code = "def main_func(): pass"
    codeflash_output = detect_unused_helper_functions(fto, ctx, code) # 51.6μs -> 52.7μs (2.18% slower)

def test_all_helpers_used():
    """All helpers are used: should return empty list."""
    helpers = [
        FunctionSource("helper1"),
        FunctionSource("helper2"),
    ]
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext(helpers)
    code = """
def main_func():
    helper1()
    helper2()
def helper1(): pass
def helper2(): pass
"""
    codeflash_output = detect_unused_helper_functions(fto, ctx, code) # 113μs -> 100μs (13.1% faster)

def test_some_helpers_unused():
    """Some helpers are unused: should return only unused ones."""
    helpers = [
        FunctionSource("helper1"),
        FunctionSource("helper2"),
        FunctionSource("helper3"),
    ]
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext(helpers)
    code = """
def main_func():
    helper1()
def helper1(): pass
def helper2(): pass
def helper3(): pass
"""
    codeflash_output = detect_unused_helper_functions(fto, ctx, code); unused = codeflash_output # 108μs -> 92.2μs (17.3% faster)

def test_helper_called_within_another_helper():
    """Helper called only by another helper, not directly by entrypoint, is unused."""
    helpers = [
        FunctionSource("helper1"),
        FunctionSource("helper2"),
    ]
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext(helpers)
    code = """
def main_func():
    helper1()
def helper1():
    helper2()
def helper2(): pass
"""
    # Only helper1 is called from entrypoint; helper2 is not called directly
    codeflash_output = detect_unused_helper_functions(fto, ctx, code); unused = codeflash_output # 100μs -> 84.0μs (19.0% faster)

def test_class_method_helpers():
    """Helpers defined as class methods, called via self."""
    class_parent = SimpleNamespace(name="MyClass")
    helpers = [
        FunctionSource("helper1", qualified_name="MyClass.helper1"),
        FunctionSource("helper2", qualified_name="MyClass.helper2"),
    ]
    fto = FunctionToOptimize("main_func", Path("main.py"), parents=[class_parent])
    ctx = CodeOptimizationContext(helpers)
    code = """
class MyClass:
    def main_func(self):
        self.helper1()
    def helper1(self): pass
    def helper2(self): pass
"""
    codeflash_output = detect_unused_helper_functions(fto, ctx, code); unused = codeflash_output # 121μs -> 103μs (17.6% faster)

def test_entrypoint_not_found_returns_empty():
    """Entrypoint function not found: should return empty list."""
    helpers = [FunctionSource("helper1")]
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext(helpers)
    code = "def other_func(): pass"
    codeflash_output = detect_unused_helper_functions(fto, ctx, code) # 22.1μs -> 21.0μs (5.54% faster)

# 2. Edge Test Cases

def test_helper_with_same_name_as_builtin():
    """Helper function has same name as a builtin; only counted if called."""
    helpers = [FunctionSource("print")]
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext(helpers)
    code = """
def main_func():
    pass
def print(): pass
"""
    codeflash_output = detect_unused_helper_functions(fto, ctx, code); unused = codeflash_output # 66.1μs -> 59.7μs (10.7% faster)

def test_helper_called_as_attribute_of_imported_module():
    """Helper is in another file and called as module.function()."""
    helpers = [
        FunctionSource("helper1", file_path=Path("utils.py"), qualified_name="helper1", fully_qualified_name="utils.helper1"),
        FunctionSource("helper2", file_path=Path("utils.py"), qualified_name="helper2", fully_qualified_name="utils.helper2"),
    ]
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext(helpers)
    code = """
import utils
def main_func():
    utils.helper1()
"""
    codeflash_output = detect_unused_helper_functions(fto, ctx, code); unused = codeflash_output # 102μs -> 92.8μs (10.3% faster)

def test_helper_called_via_from_import():
    """Helper is imported via 'from utils import helper1' and called directly."""
    helpers = [
        FunctionSource("helper1", file_path=Path("utils.py"), qualified_name="helper1", fully_qualified_name="utils.helper1"),
        FunctionSource("helper2", file_path=Path("utils.py"), qualified_name="helper2", fully_qualified_name="utils.helper2"),
    ]
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext(helpers)
    code = """
from utils import helper1
def main_func():
    helper1()
"""
    codeflash_output = detect_unused_helper_functions(fto, ctx, code); unused = codeflash_output # 94.2μs -> 85.2μs (10.5% faster)

def test_helper_called_via_from_import_as():
    """Helper is imported as alias and called via alias."""
    helpers = [
        FunctionSource("helper1", file_path=Path("utils.py"), qualified_name="helper1", fully_qualified_name="utils.helper1"),
    ]
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext(helpers)
    code = """
from utils import helper1 as h1
def main_func():
    h1()
"""
    codeflash_output = detect_unused_helper_functions(fto, ctx, code); unused = codeflash_output # 85.0μs -> 76.0μs (11.9% faster)

def test_helper_called_via_import_as():
    """Helper is imported as module alias and called via alias.function()."""
    helpers = [
        FunctionSource("helper1", file_path=Path("utils.py"), qualified_name="helper1", fully_qualified_name="utils.helper1"),
    ]
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext(helpers)
    code = """
import utils as u
def main_func():
    u.helper1()
"""
    codeflash_output = detect_unused_helper_functions(fto, ctx, code); unused = codeflash_output # 86.4μs -> 75.6μs (14.3% faster)

def test_helper_with_non_function_type():
    """Helper is a class, should be ignored."""
    helpers = [
        FunctionSource("helper1", jedi_definition=SimpleNamespace(type="class")),
        FunctionSource("helper2"),
    ]
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext(helpers)
    code = """
def main_func():
    helper2()
def helper2(): pass
"""
    codeflash_output = detect_unused_helper_functions(fto, ctx, code); unused = codeflash_output # 78.3μs -> 67.2μs (16.5% faster)

def test_helper_called_as_method_of_other_object():
    """Helper called as obj.helper1() (not self), should match attr name."""
    helpers = [
        FunctionSource("helper1"),
        FunctionSource("helper2"),
    ]
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext(helpers)
    code = """
def main_func():
    x = SomeObj()
    x.helper1()
def helper1(): pass
def helper2(): pass
"""
    codeflash_output = detect_unused_helper_functions(fto, ctx, code); unused = codeflash_output # 117μs -> 98.4μs (19.4% faster)

def test_helper_called_within_nested_class():
    """Entrypoint is inside a nested class, helper called as self.method()."""
    class_parent = SimpleNamespace(name="Outer")
    inner_parent = SimpleNamespace(name="Inner")
    helpers = [
        FunctionSource("helper1", qualified_name="Inner.helper1"),
        FunctionSource("helper2", qualified_name="Inner.helper2"),
    ]
    fto = FunctionToOptimize("main_func", Path("main.py"), parents=[class_parent, inner_parent])
    ctx = CodeOptimizationContext(helpers)
    code = """
class Outer:
    class Inner:
        def main_func(self):
            self.helper1()
        def helper1(self): pass
        def helper2(self): pass
"""
    codeflash_output = detect_unused_helper_functions(fto, ctx, code); unused = codeflash_output # 120μs -> 98.7μs (21.8% faster)

def test_multiple_code_strings_markdown():
    """CodeStringsMarkdown: should aggregate unused helpers from all code strings."""
    helpers = [
        FunctionSource("helper1"),
        FunctionSource("helper2"),
        FunctionSource("helper3"),
    ]
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext(helpers)
    code1 = SimpleNamespace(code="def main_func(): helper1()\ndef helper1(): pass\ndef helper2(): pass")
    code2 = SimpleNamespace(code="def main_func(): helper3()\ndef helper3(): pass")
    md = CodeStringsMarkdown([code1, code2])
    codeflash_output = detect_unused_helper_functions(fto, ctx, md); unused = codeflash_output # 10.6μs -> 10.8μs (1.39% slower)

def test_invalid_code_returns_empty():
    """Invalid code should not raise, just return empty list."""
    helpers = [FunctionSource("helper1")]
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext(helpers)
    code = "def main_func(: pass"
    codeflash_output = detect_unused_helper_functions(fto, ctx, code); unused = codeflash_output # 39.3μs -> 42.0μs (6.44% slower)

# 3. Large Scale Test Cases

def test_large_number_of_helpers_and_calls():
    """Large scale: 500 helpers, 250 used, 250 unused."""
    N = 500
    used = [FunctionSource(f"helper{i}") for i in range(250)]
    unused = [FunctionSource(f"helper{i}") for i in range(250, N)]
    helpers = used + unused
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext(helpers)
    calls = "\n    ".join(f"helper{i}()" for i in range(250))
    defs = "\n".join(f"def helper{i}(): pass" for i in range(N))
    code = f"def main_func():\n    {calls}\n{defs}"
    codeflash_output = detect_unused_helper_functions(fto, ctx, code); unused_result = codeflash_output # 8.57ms -> 6.04ms (42.0% faster)

def test_large_number_of_helpers_all_used():
    """Large scale: 500 helpers, all used."""
    N = 500
    helpers = [FunctionSource(f"helper{i}") for i in range(N)]
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext(helpers)
    calls = "\n    ".join(f"helper{i}()" for i in range(N))
    defs = "\n".join(f"def helper{i}(): pass" for i in range(N))
    code = f"def main_func():\n    {calls}\n{defs}"
    codeflash_output = detect_unused_helper_functions(fto, ctx, code); unused_result = codeflash_output # 10.2ms -> 7.27ms (40.4% faster)

def test_large_number_of_helpers_none_used():
    """Large scale: 500 helpers, none used."""
    N = 500
    helpers = [FunctionSource(f"helper{i}") for i in range(N)]
    fto = FunctionToOptimize("main_func", Path("main.py"))
    ctx = CodeOptimizationContext(helpers)
    defs = "\n".join(f"def helper{i}(): pass" for i in range(N))
    code = f"def main_func():\n    pass\n{defs}"
    codeflash_output = detect_unused_helper_functions(fto, ctx, code); unused_result = codeflash_output # 5.72ms -> 4.00ms (43.2% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import ast
from pathlib import Path

# imports
import pytest
from codeflash.context.unused_definition_remover import \
    detect_unused_helper_functions

# --- Minimal stubs for required classes and objects ---

class DummyJediDefinition:
    def __init__(self, type_):
        self.type = type_

class DummyParent:
    def __init__(self, name):
        self.name = name

class FunctionSource:
    def __init__(
        self, only_function_name, qualified_name, fully_qualified_name, file_path, jedi_definition
    ):
        self.only_function_name = only_function_name
        self.qualified_name = qualified_name
        self.fully_qualified_name = fully_qualified_name
        self.file_path = file_path
        self.jedi_definition = jedi_definition

    def __eq__(self, other):
        if not isinstance(other, FunctionSource):
            return False
        return (
            self.only_function_name == other.only_function_name
            and self.qualified_name == other.qualified_name
            and self.fully_qualified_name == other.fully_qualified_name
            and self.file_path == other.file_path
        )

    def __repr__(self):
        return f"FunctionSource({self.only_function_name!r}, {self.qualified_name!r}, {self.fully_qualified_name!r}, {self.file_path!r})"

class FunctionToOptimize:
    def __init__(self, function_name, file_path, parents=None):
        self.function_name = function_name
        self.file_path = file_path
        self.parents = parents or []

class CodeOptimizationContext:
    def __init__(self, helper_functions):
        self.helper_functions = helper_functions

class CodeStringsMarkdown:
    def __init__(self, code_strings):
        self.code_strings = code_strings

    @property
    def code(self):
        # For compatibility with the main function
        return self.code_strings[0].code if self.code_strings else ""

class DummyCodeString:
    def __init__(self, code):
        self.code = code

# --- Function to test (copied from above, with logger stubbed out) ---

class DummyLogger:
    def debug(self, msg):
        pass

logger = DummyLogger()
from codeflash.context.unused_definition_remover import \
    detect_unused_helper_functions

# --- Unit tests ---

# Helper to create FunctionSource objects
def make_helper(name, module, type_="function"):
    return FunctionSource(
        only_function_name=name,
        qualified_name=f"{module}.{name}",
        fully_qualified_name=f"{module}.{name}",
        file_path=Path(f"{module}.py"),
        jedi_definition=DummyJediDefinition(type_)
    )

# Helper to create FunctionToOptimize objects
def make_entrypoint(name, module, parents=None):
    return FunctionToOptimize(
        function_name=name,
        file_path=Path(f"{module}.py"),
        parents=parents or []
    )

# --------------- BASIC TEST CASES ------------------

def test_basic_single_helper_used():
    # Entrypoint calls helper; helper should NOT be detected as unused
    helper = make_helper("foo_helper", "main")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
def main_func():
    foo_helper()
def foo_helper():
    pass
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 89.3μs -> 79.7μs (12.0% faster)

def test_basic_single_helper_unused():
    # Entrypoint does NOT call helper; helper should be detected as unused
    helper = make_helper("foo_helper", "main")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
def main_func():
    pass
def foo_helper():
    pass
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 68.3μs -> 61.3μs (11.5% faster)

def test_basic_multiple_helpers_some_unused():
    # Entrypoint calls one helper, not the other
    helper1 = make_helper("foo_helper", "main")
    helper2 = make_helper("bar_helper", "main")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
def main_func():
    foo_helper()
def foo_helper():
    pass
def bar_helper():
    pass
"""
    context = CodeOptimizationContext([helper1, helper2])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 95.9μs -> 83.2μs (15.3% faster)

def test_basic_class_method_helper_used():
    # Entrypoint is a class method, calls another method via self
    helper = make_helper("foo_helper", "main")
    parent = DummyParent("MyClass")
    entrypoint = make_entrypoint("main_func", "main", [parent])
    code = """
class MyClass:
    def main_func(self):
        self.foo_helper()
    def foo_helper(self):
        pass
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 100μs -> 85.8μs (16.6% faster)

def test_basic_class_method_helper_unused():
    # Entrypoint is a class method, does NOT call helper
    helper = make_helper("foo_helper", "main")
    parent = DummyParent("MyClass")
    entrypoint = make_entrypoint("main_func", "main", [parent])
    code = """
class MyClass:
    def main_func(self):
        pass
    def foo_helper(self):
        pass
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 79.1μs -> 68.5μs (15.4% faster)

def test_basic_imported_helper_used():
    # Entrypoint calls helper imported from another module
    helper = make_helper("foo_helper", "helpers")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
from helpers import foo_helper
def main_func():
    foo_helper()
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 88.3μs -> 80.4μs (9.88% faster)

def test_basic_imported_helper_unused():
    # Entrypoint does not call imported helper
    helper = make_helper("foo_helper", "helpers")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
from helpers import foo_helper
def main_func():
    pass
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 69.2μs -> 61.6μs (12.3% faster)

def test_basic_imported_module_helper_used():
    # Entrypoint calls helper via module.function
    helper = make_helper("foo_helper", "helpers")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
import helpers
def main_func():
    helpers.foo_helper()
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 89.0μs -> 79.7μs (11.6% faster)

def test_basic_imported_module_helper_unused():
    # Entrypoint does not call helper from imported module
    helper = make_helper("foo_helper", "helpers")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
import helpers
def main_func():
    pass
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 67.7μs -> 60.7μs (11.6% faster)

def test_basic_helper_called_with_alias():
    # Entrypoint calls helper imported as an alias
    helper = make_helper("foo_helper", "helpers")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
from helpers import foo_helper as fh
def main_func():
    fh()
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 84.9μs -> 78.4μs (8.27% faster)

def test_basic_module_imported_as_alias():
    # Entrypoint calls helper via module alias
    helper = make_helper("foo_helper", "helpers")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
import helpers as hp
def main_func():
    hp.foo_helper()
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 86.6μs -> 76.6μs (13.0% faster)

# --------------- EDGE TEST CASES ------------------

def test_edge_no_helpers():
    # No helpers in context; should return empty list
    entrypoint = make_entrypoint("main_func", "main")
    code = """
def main_func():
    pass
"""
    context = CodeOptimizationContext([])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 45.1μs -> 40.9μs (10.3% faster)

def test_edge_entrypoint_not_found():
    # Entrypoint function not present in code
    helper = make_helper("foo_helper", "main")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
def foo_helper():
    pass
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 21.3μs -> 21.8μs (2.25% slower)

def test_edge_helper_is_class():
    # Helper is a class, not a function; should not be considered
    helper = make_helper("foo_helper", "main", type_="class")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
def main_func():
    pass
class foo_helper:
    pass
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 51.5μs -> 46.8μs (10.1% faster)

def test_edge_helper_called_as_attribute_of_object():
    # Helper called as obj.foo_helper(), not self or module
    helper = make_helper("foo_helper", "main")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
def main_func():
    obj = SomeClass()
    obj.foo_helper()
def foo_helper():
    pass
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 103μs -> 87.1μs (18.8% faster)

def test_edge_helper_called_within_nested_function():
    # Helper called within a nested function inside entrypoint
    helper = make_helper("foo_helper", "main")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
def main_func():
    def inner():
        foo_helper()
    inner()
def foo_helper():
    pass
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 99.7μs -> 85.2μs (17.1% faster)

def test_edge_helper_called_within_loop():
    # Helper called inside a loop
    helper = make_helper("foo_helper", "main")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
def main_func():
    for i in range(5):
        foo_helper()
def foo_helper():
    pass
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 104μs -> 87.9μs (18.8% faster)

def test_edge_helper_called_within_if():
    # Helper called conditionally
    helper = make_helper("foo_helper", "main")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
def main_func():
    if True:
        foo_helper()
def foo_helper():
    pass
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 90.3μs -> 79.7μs (13.3% faster)

def test_edge_helper_called_with_different_name_in_import():
    # Helper imported as different name and called
    helper = make_helper("foo_helper", "helpers")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
from helpers import foo_helper as fh
def main_func():
    fh()
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 84.6μs -> 74.6μs (13.5% faster)

def test_edge_helper_called_with_module_alias():
    # Helper called via module alias
    helper = make_helper("foo_helper", "helpers")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
import helpers as hp
def main_func():
    hp.foo_helper()
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 87.0μs -> 78.6μs (10.7% faster)

def test_edge_helper_called_with_fully_qualified_name():
    # Helper called with fully qualified name
    helper = make_helper("foo_helper", "helpers")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
from helpers import foo_helper
def main_func():
    helpers.foo_helper()
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 86.5μs -> 74.8μs (15.6% faster)

def test_edge_helpers_with_same_name_in_different_modules():
    # Two helpers with same name in different modules; only one is called
    helper1 = make_helper("foo_helper", "helpers1")
    helper2 = make_helper("foo_helper", "helpers2")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
from helpers1 import foo_helper
def main_func():
    foo_helper()
"""
    context = CodeOptimizationContext([helper1, helper2])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 90.7μs -> 81.7μs (11.0% faster)

def test_edge_code_as_CodeStringsMarkdown():
    # Code passed as CodeStringsMarkdown with multiple code strings
    helper1 = make_helper("foo_helper", "main")
    helper2 = make_helper("bar_helper", "main")
    entrypoint = make_entrypoint("main_func", "main")
    code1 = DummyCodeString("""
def main_func():
    foo_helper()
def foo_helper():
    pass
def bar_helper():
    pass
""")
    code2 = DummyCodeString("""
def main_func():
    bar_helper()
def foo_helper():
    pass
def bar_helper():
    pass
""")
    code_md = CodeStringsMarkdown([code1, code2])
    context = CodeOptimizationContext([helper1, helper2])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code_md); unused = codeflash_output # 10.1μs -> 11.0μs (8.23% slower)

def test_edge_helper_called_in_async_function():
    # Entrypoint is async, calls helper
    helper = make_helper("foo_helper", "main")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
async def main_func():
    foo_helper()
def foo_helper():
    pass
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 82.0μs -> 70.5μs (16.4% faster)

def test_edge_helper_called_within_try_except():
    # Helper called inside try block
    helper = make_helper("foo_helper", "main")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
def main_func():
    try:
        foo_helper()
    except Exception:
        pass
def foo_helper():
    pass
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 99.9μs -> 86.8μs (15.1% faster)

def test_edge_helper_called_multiple_times():
    # Helper called multiple times in entrypoint
    helper = make_helper("foo_helper", "main")
    entrypoint = make_entrypoint("main_func", "main")
    code = """
def main_func():
    foo_helper()
    foo_helper()
def foo_helper():
    pass
"""
    context = CodeOptimizationContext([helper])
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 89.1μs -> 75.5μs (17.9% faster)

# --------------- LARGE SCALE TEST CASES ------------------

def test_large_scale_many_helpers_some_unused():
    # 100 helpers, only first 10 used
    helpers = [make_helper(f"helper_{i}", "main") for i in range(100)]
    entrypoint = make_entrypoint("main_func", "main")
    used_helpers = "\n".join([f"    helper_{i}()" for i in range(10)])
    helpers_def = "\n".join([f"def helper_{i}(): pass" for i in range(100)])
    code = f"""
def main_func():
{used_helpers}
{helpers_def}
"""
    context = CodeOptimizationContext(helpers)
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 1.36ms -> 1.00ms (35.6% faster)
    expected_unused = helpers[10:]

def test_large_scale_many_helpers_all_used():
    # 100 helpers, all used
    helpers = [make_helper(f"helper_{i}", "main") for i in range(100)]
    entrypoint = make_entrypoint("main_func", "main")
    used_helpers = "\n".join([f"    helper_{i}()" for i in range(100)])
    helpers_def = "\n".join([f"def helper_{i}(): pass" for i in range(100)])
    code = f"""
def main_func():
{used_helpers}
{helpers_def}
"""
    context = CodeOptimizationContext(helpers)
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 2.10ms -> 1.48ms (41.2% faster)

def test_large_scale_many_helpers_none_used():
    # 100 helpers, none used
    helpers = [make_helper(f"helper_{i}", "main") for i in range(100)]
    entrypoint = make_entrypoint("main_func", "main")
    helpers_def = "\n".join([f"def helper_{i}(): pass" for i in range(100)])
    code = f"""
def main_func():
    pass
{helpers_def}
"""
    context = CodeOptimizationContext(helpers)
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 1.22ms -> 869μs (40.2% faster)

def test_large_scale_helpers_across_multiple_modules():
    # 50 helpers in main, 50 in helpers; only some used
    helpers_main = [make_helper(f"helper_{i}", "main") for i in range(50)]
    helpers_helpers = [make_helper(f"helper_{i}", "helpers") for i in range(50)]
    entrypoint = make_entrypoint("main_func", "main")
    used_main = "\n".join([f"    helper_{i}()" for i in range(10)])
    used_helpers = "\n".join([f"    helpers.helper_{i}()" for i in range(10)])
    helpers_main_def = "\n".join([f"def helper_{i}(): pass" for i in range(50)])
    helpers_helpers_def = "\n".join([f"# helpers.helper_{i} is in other module" for i in range(50)])
    code = f"""
import helpers
def main_func():
{used_main}
{used_helpers}
{helpers_main_def}
{helpers_helpers_def}
"""
    context = CodeOptimizationContext(helpers_main + helpers_helpers)
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code); unused = codeflash_output # 1.22ms -> 981μs (24.1% faster)
    expected_unused = helpers_main[10:] + helpers_helpers[10:]

def test_large_scale_CodeStringsMarkdown():
    # 10 helpers, used in alternating code strings
    helpers = [make_helper(f"helper_{i}", "main") for i in range(10)]
    entrypoint = make_entrypoint("main_func", "main")
    code_strings = []
    for i in range(10):
        used = f"    helper_{i}()"
        helpers_def = "\n".join([f"def helper_{j}(): pass" for j in range(10)])
        code = f"""
def main_func():
{used}
{helpers_def}
"""
        code_strings.append(DummyCodeString(code))
    code_md = CodeStringsMarkdown(code_strings)
    context = CodeOptimizationContext(helpers)
    codeflash_output = detect_unused_helper_functions(entrypoint, context, code_md); unused = codeflash_output # 9.81μs -> 10.1μs (3.17% slower)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-pr841-2025-10-21T21.27.16 and push.

Codeflash

**Key optimizations made:**
- Avoided building unnecessary generator lists during AST traversal by reducing reliance on `ast.walk` in `_analyze_imports_in_optimized_code`, switching to a manual stack-based visitor that only scans body attributes, which is all that's needed for import analysis.
- Cached local variable lookups where they are inside loops for reduced global lookup overhead.
- Used `set.isdisjoint` for checking if helper names are unused, which is faster (short-circuits) than set intersection then `if not ...`.
- Used in-place .add and .update to the `called_function_names` set to save attribute/method lookup costs.
- Multiple small memory and speed optimizations by flattening variable accesses and minimizing unnecessary structure copying. 

All comments, signatures, and behaviors are preserved and the code structure is unchanged unless a change was necessary for optimization.
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 21, 2025
@codeflash-ai codeflash-ai bot mentioned this pull request Oct 21, 2025
@misrasaurabh1
Copy link
Contributor

too hard to review

@codeflash-ai codeflash-ai bot deleted the codeflash/optimize-pr841-2025-10-21T21.27.16 branch October 21, 2025 21:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

⚡️ codeflash Optimization PR opened by Codeflash AI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant