⚡️ Speed up function _collect_synthetic_constructor_type_names by 33% in PR #1860 (fix/attrs-init-instrumentation) #1865

Closed
codeflash-ai[bot] wants to merge 1 commit into fix/attrs-init-instrumentation from codeflash/optimize-pr1860-2026-03-18T09.30.52

codeflash-ai bot commented Mar 18, 2026

⚡️ This pull request contains optimizations for PR #1860

If you approve this dependent PR, these changes will be merged into the original PR branch fix/attrs-init-instrumentation.

This PR will be automatically closed if the original PR is merged.


📄 33% (0.33x) speedup for _collect_synthetic_constructor_type_names in codeflash/languages/python/context/code_context_extractor.py

⏱️ Runtime: 1.61 milliseconds → 1.21 milliseconds (best of 117 runs)
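
The percentages in this report are consistent with a speedup defined as time saved divided by the optimized runtime. The sketch below is an assumption inferred from the reported numbers, not Codeflash's documented formula, and the helper name `speedup` is hypothetical.

```python
def speedup(original: float, optimized: float) -> float:
    """Relative speedup: time saved divided by the optimized runtime.

    Assumed formula; a negative result means the optimized code was slower.
    """
    return (original - optimized) / optimized


# Overall runtimes from the report, in milliseconds.
overall = speedup(1.61, 1.21)

# One of the per-test timings that got slower, in microseconds.
regression = speedup(4.65, 5.10)

print(f"overall: {overall:.1%}, regression: {regression:.1%}")
```

Under this definition the headline figure is a ratio of about 1.33x between the two runtimes, which the report writes as "33% (0.33x)".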

📝 Explanation and details

The optimization replaced the augmented set-union operator (`|=`) with `set.update()` in two recursive functions, and added an early-exit fast path to `_is_classvar_annotation` that handles the common case of a simple `ast.Name` annotation without calling the costlier `_get_expr_name` helper. Line profiler shows `_is_classvar_annotation` dropped from 6.6 ms to 2.0 ms (about 70% less time) because 99.6% of calls now hit the fast path. The `_expr_matches_name` call frequency fell from 1,390 to 74 hits per invocation because ClassVar checks resolve immediately. Additionally, `_expr_matches_name` itself was reordered to test exact equality before constructing the `"." + suffix` string, saving an allocation when the exact match succeeds. Overall runtime improved by about 33% (1.61 ms to 1.21 ms) with no functional changes.
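
As a minimal sketch of the first change, the two collectors below differ only in how they merge the child results. The function names are hypothetical and the traversal is a simplification; it does not reproduce the real `collect_type_names_from_annotation`, which is not shown in this PR page. Requires Python 3.9+ for the `set[str]` annotation.

```python
import ast


def collect_names_union(node: ast.AST) -> set[str]:
    """Collect every ast.Name id under `node`, merging children with `|=`."""
    names: set[str] = set()
    if isinstance(node, ast.Name):
        names.add(node.id)
    for child in ast.iter_child_nodes(node):
        names |= collect_names_union(child)
    return names


def collect_names_update(node: ast.AST) -> set[str]:
    """Same traversal, merging children with set.update()."""
    names: set[str] = set()
    if isinstance(node, ast.Name):
        names.add(node.id)
    for child in ast.iter_child_nodes(node):
        names.update(collect_names_update(child))
    return names


# Parse a bare annotation expression and compare the two collectors.
annotation = ast.parse("dict[str, list[int]]", mode="eval").body
assert collect_names_union(annotation) == collect_names_update(annotation)
```

Both versions mutate `names` in place and return the same set, so any difference is in per-call overhead only. Whether that overhead is measurable depends on the interpreter version and workload.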

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 49 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
import ast

# import the function under test from the real module
from codeflash.languages.python.context.code_context_extractor import _collect_synthetic_constructor_type_names


def test_namedtuple_collects_annotations_and_ignores_classvar():
    src = """
from typing import NamedTuple, ClassVar

class MyNT(NamedTuple):
    a: int
    b: ClassVar[float]
    c: list[str]
"""
    module = ast.parse(src)
    class_node = module.body[1]
    names = _collect_synthetic_constructor_type_names(class_node, {})  # 9.64μs -> 8.19μs (17.7% faster)
    assert names == {"int", "list", "str"}


def test_dataclass_respects_init_flag_and_field_init_keyword():
    src = """
from dataclasses import dataclass, field

@dataclass
class DC:
    keep: int
    turned_off: float = field(init=False)
    explicit_on: str = field(init=True)
"""
    module = ast.parse(src)
    class_node = module.body[1]
    names = _collect_synthetic_constructor_type_names(class_node, {})  # 12.1μs -> 10.4μs (15.9% faster)
    assert names == {"int", "str"}


def test_dataclass_with_init_false_decorator_skips_all():
    src = """
from dataclasses import dataclass, field

@dataclass(init=False)
class DCNoInit:
    a: int
    b: str
    c: float = field(init=True)
"""
    module = ast.parse(src)
    class_node = module.body[1]
    names = _collect_synthetic_constructor_type_names(class_node, {})  # 4.49μs -> 4.41μs (1.81% faster)
    assert names == set()


def test_attrs_decorator_attribute_style_and_init_flag():
    src_collect = """
import attrs

@attrs.define
class A:
    x: int
    y: list[str]
"""
    module_collect = ast.parse(src_collect)
    class_collect = module_collect.body[1]
    names_collect = _collect_synthetic_constructor_type_names(class_collect, {})  # 12.2μs -> 11.6μs (5.63% faster)
    assert names_collect == {"int", "list", "str"}

    src_noinit = """
import attrs

@attrs.define(init=False)
class B:
    x: int
    y: str
"""
    module_noinit = ast.parse(src_noinit)
    class_noinit = module_noinit.body[1]
    names_noinit = _collect_synthetic_constructor_type_names(class_noinit, {})  # 4.65μs -> 5.10μs (8.84% slower)
    assert names_noinit == set()


def test_non_relevant_class_returns_empty_set():
    # A normal class without NamedTuple/dataclass/attrs decorators should yield no synthetic constructor names.
    src = """
class Plain:
    a: int
    b: str
"""
    module = ast.parse(src)
    class_node = module.body[0]
    names = _collect_synthetic_constructor_type_names(class_node, {})  # 1.48μs -> 1.34μs (10.4% faster)
    assert names == set()


def test_various_annotation_structures_collected():
    src = """
class C(NamedTuple):
    u: int | str
    t: tuple[int, str]
    m: dict[str, int]
    a: some_module.Type
"""
    module = ast.parse(src)
    class_node = module.body[0]
    names = _collect_synthetic_constructor_type_names(class_node, {})  # 16.6μs -> 16.2μs (1.85% faster)
    assert "int" in names and "str" in names and "dict" in names
    assert "some_module.Type" not in names


def test_annassign_without_annotation_and_non_name_targets_ignored():
    src = """
from dataclasses import dataclass

@dataclass
class NormalNT:
    good: int
    excellent: str
"""
    module = ast.parse(src)
    class_node = module.body[1]
    names = _collect_synthetic_constructor_type_names(class_node, {})  # 8.79μs -> 7.38μs (19.0% faster)
    assert names == {"int", "str"}


def test_namedtuple_with_import_aliases_for_bases():
    src = """
from typing import NamedTuple

class AliasNT(NamedTuple):
    a: int
"""
    module = ast.parse(src)
    class_node = module.body[1]
    import_aliases = {"NamedTuple": "typing.NamedTuple"}
    names = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 6.24μs -> 5.40μs (15.6% faster)
    assert names == {"int"}


def test_large_number_of_fields_performance_and_correctness():
    # Create a dataclass with 1000 annotated fields to exercise scalability.
    num_fields = 1000
    lines = ["from dataclasses import dataclass"]
    lines.append("@dataclass")
    lines.append("class Big:")
    for i in range(num_fields):
        # Use distinct type names T0, T1, ... to ensure unique collection
        lines.append(f"    f{i}: T{i}")
    src = "\n".join(lines) + "\n"
    module = ast.parse(src)
    class_node = module.body[1]  # class Big is the second top-level node
    names = _collect_synthetic_constructor_type_names(class_node, {})  # 947μs -> 660μs (43.4% faster)
    # Expect exactly the set of T0..T{num_fields-1}
    expected = {f"T{i}" for i in range(num_fields)}
    assert names == expected
    # Spot-check a few members to ensure correctness
    assert "T0" in names and f"T{num_fields - 1}" in names


def test_large_number_with_some_classvar_ignored():
    # Mix many fields where every 10th field is a ClassVar and must be ignored.
    num_fields = 200
    lines = ["class ManyNT(NamedTuple):"]
    for i in range(num_fields):
        if i % 10 == 0:
            # ClassVar should be ignored
            lines.append(f"    cv{i}: ClassVar[CV{i}]")
        else:
            lines.append(f"    f{i}: FT{i}")
    src = "\n".join(lines) + "\n"
    module = ast.parse(src)
    class_node = module.body[0]
    names = _collect_synthetic_constructor_type_names(class_node, {})  # 193μs -> 135μs (43.4% faster)
    # Expect all FT* but none of CV*.
    expected = {f"FT{i}" for i in range(num_fields) if i % 10 != 0}
    assert names == expected
import ast

# imports
from codeflash.languages.python.context.code_context_extractor import _collect_synthetic_constructor_type_names


def test_namedtuple_simple_type():
    # Test that a NamedTuple with a simple type annotation collects the type name
    code = """
from typing import NamedTuple

class Point(NamedTuple):
    x: int
    y: str
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 8.20μs -> 6.96μs (17.7% faster)
    assert result == {"int", "str"}


def test_dataclass_simple_type():
    # Test that a dataclass with simple type annotations collects all types
    code = """
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 8.96μs -> 7.76μs (15.4% faster)
    assert result == {"str", "int"}


def test_attrs_simple_type():
    # Test that an attrs class with simple type annotations collects all types
    code = """
import attrs

@attrs.define
class Animal:
    name: str
    age: int
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 11.3μs -> 10.5μs (7.61% faster)
    assert result == {"str", "int"}


def test_non_synthetic_class():
    # Test that a regular class (not NamedTuple, dataclass, or attrs) returns empty set
    code = """
class Regular:
    x: int
    y: str
"""
    tree = ast.parse(code)
    class_node = tree.body[0]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 1.56μs -> 1.47μs (6.11% faster)
    assert result == set()


def test_dataclass_with_init_false():
    # Test that a dataclass with init=False returns empty set
    code = """
from dataclasses import dataclass

@dataclass(init=False)
class Person:
    name: str
    age: int
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 4.70μs -> 4.56μs (3.05% faster)
    assert result == set()


def test_dataclass_with_init_true():
    # Test that a dataclass with init=True explicitly still collects types
    code = """
from dataclasses import dataclass

@dataclass(init=True)
class Person:
    name: str
    age: int
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 9.76μs -> 8.54μs (14.3% faster)
    assert result == {"str", "int"}


def test_classvar_excluded():
    # Test that ClassVar annotations are excluded from the collected types
    code = """
from dataclasses import dataclass
from typing import ClassVar

@dataclass
class Config:
    value: int
    version: ClassVar[int]
"""
    tree = ast.parse(code)
    class_node = tree.body[2]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 8.31μs -> 7.12μs (16.7% faster)
    assert result == {"int"}


def test_field_init_false():
    # Test that fields with init=False are excluded
    code = """
from dataclasses import dataclass, field

@dataclass
class Person:
    name: str
    computed: int = field(init=False)
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 9.58μs -> 8.26μs (16.0% faster)
    assert result == {"str"}


def test_field_init_true():
    # Test that fields with init=True are included
    code = """
from dataclasses import dataclass, field

@dataclass
class Person:
    name: str
    age: int = field(init=True)
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 9.93μs -> 8.49μs (16.9% faster)
    assert result == {"str", "int"}


def test_generic_type_annotation():
    # Test that generic type annotations like List[str] collect all involved types
    code = """
from typing import List
from dataclasses import dataclass

@dataclass
class Container:
    items: List[str]
"""
    tree = ast.parse(code)
    class_node = tree.body[2]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 7.88μs -> 7.29μs (7.98% faster)
    assert result == {"List", "str"}


def test_union_type_annotation():
    # Test that union types (int | str) collect both types
    code = """
from dataclasses import dataclass

@dataclass
class Flexible:
    value: int | str
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 7.59μs -> 7.61μs (0.263% slower)
    assert result == {"int", "str"}


def test_optional_type_annotation():
    # Test that Optional types collect the inner type
    code = """
from typing import Optional
from dataclasses import dataclass

@dataclass
class MaybeValue:
    value: Optional[int]
"""
    tree = ast.parse(code)
    class_node = tree.body[2]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 7.85μs -> 7.13μs (10.1% faster)
    assert result == {"Optional", "int"}


def test_import_aliases_dataclass():
    # Test that import aliases are correctly resolved for dataclass decorator
    code = """
from dataclasses import dataclass as dc

@dc
class Person:
    name: str
    age: int
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {"dc": "dataclasses.dataclass"}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 9.17μs -> 8.41μs (9.05% faster)
    assert result == {"str", "int"}


def test_import_aliases_field():
    # Test that import aliases are correctly resolved for the field() call
    code = """
from dataclasses import dataclass, field as f

@dataclass
class Person:
    name: str
    computed: int = f(init=False)
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {"f": "dataclasses.field"}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 9.93μs -> 9.30μs (6.78% faster)
    assert result == {"str"}


def test_empty_class_body():
    # Test that a dataclass with no annotated attributes returns empty set
    code = """
from dataclasses import dataclass

@dataclass
class Empty:
    pass
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 4.95μs -> 4.76μs (3.99% faster)
    assert result == set()


def test_only_methods_no_annotations():
    # Test that a class with only methods (no AnnAssign) returns empty set
    code = """
from dataclasses import dataclass

@dataclass
class OnlyMethods:
    def method(self):
        pass
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 4.95μs -> 4.85μs (2.06% faster)
    assert result == set()


def test_assignment_without_annotation():
    # Test that regular assignments (without type annotation) are ignored
    code = """
from dataclasses import dataclass

@dataclass
class Mixed:
    x = 5
    y: int
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 7.50μs -> 6.61μs (13.5% faster)
    assert result == {"int"}


def test_annotation_with_no_value():
    # Test that annotations without values are still processed
    code = """
from dataclasses import dataclass

@dataclass
class NoValue:
    name: str
    age: int
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 8.61μs -> 7.32μs (17.5% faster)
    assert result == {"str", "int"}


def test_complex_nested_generics():
    # Test that deeply nested generic types collect all type names
    code = """
from typing import Dict, List, Tuple
from dataclasses import dataclass

@dataclass
class Complex:
    data: Dict[str, List[Tuple[int, str]]]
"""
    tree = ast.parse(code)
    class_node = tree.body[2]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 12.2μs -> 11.3μs (8.55% faster)
    assert "Dict" in result and "List" in result and "Tuple" in result
    assert "str" in result and "int" in result


def test_multiple_unions():
    # Test that multiple union types are all collected
    code = """
from dataclasses import dataclass

@dataclass
class MultiUnion:
    value1: int | str | float
    value2: bool | None
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 10.2μs -> 10.4μs (2.03% slower)
    assert "int" in result and "str" in result and "float" in result
    assert "bool" in result


def test_classvar_with_generics():
    # Test that ClassVar with complex generics is properly excluded
    code = """
from dataclasses import dataclass
from typing import ClassVar, Dict

@dataclass
class WithClassVar:
    name: str
    registry: ClassVar[Dict[str, int]]
"""
    tree = ast.parse(code)
    class_node = tree.body[2]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 8.15μs -> 7.00μs (16.5% faster)
    # Only 'str' from 'name' should be included; registry is ClassVar
    assert result == {"str"}


def test_field_with_non_bool_value():
    # Test that field() with non-boolean init value doesn't crash
    code = """
from dataclasses import dataclass, field

@dataclass
class Tricky:
    name: str
    age: int = field(init="maybe")
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 10.1μs -> 8.78μs (15.1% faster)
    # Should treat 'init="maybe"' as no explicit value and include it
    assert result == {"str", "int"}


def test_attrs_with_init_false():
    # Test that attrs class with init=False returns empty set
    code = """
import attrs

@attrs.define(init=False)
class NoInit:
    name: str
    age: int
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 7.27μs -> 7.49μs (2.95% slower)
    assert result == set()


def test_attrs_mutable_decorator():
    # Test that attrs.mutable decorator is recognized
    code = """
import attrs

@attrs.mutable
class Mutable:
    name: str
    age: int
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 11.2μs -> 10.2μs (9.22% faster)
    assert result == {"str", "int"}


def test_attrs_frozen_decorator():
    # Test that attrs.frozen decorator is recognized
    code = """
import attrs

@attrs.frozen
class Frozen:
    name: str
    age: int
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 10.9μs -> 9.97μs (9.36% faster)
    assert result == {"str", "int"}


def test_attrs_short_names():
    # Test that the short attrs decorator name 's' (as in @attrs.s) is recognized
    code = """
import attrs

@attrs.s
class Short:
    name: str
    age: int
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 10.6μs -> 10.0μs (5.49% faster)
    assert result == {"str", "int"}


def test_tuple_type_annotation():
    # Test that tuple type annotations collect all element types
    code = """
from dataclasses import dataclass

@dataclass
class TupleHolder:
    coords: tuple[int, str, float]
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 9.99μs -> 9.48μs (5.40% faster)
    assert "tuple" in result or "int" in result  # tuple or int should be present


def test_custom_type_names():
    # Test that custom user-defined type names are collected
    code = """
from dataclasses import dataclass

@dataclass
class Custom:
    obj: MyCustomType
    items: list[AnotherType]
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 9.32μs -> 8.15μs (14.2% faster)
    assert "MyCustomType" in result
    assert "list" in result
    assert "AnotherType" in result


def test_namedtuple_with_defaults():
    # Test NamedTuple with default values (class syntax, supported since Python 3.6.1)
    code = """
from typing import NamedTuple

class Point(NamedTuple):
    x: int = 0
    y: str = "origin"
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 7.79μs -> 6.40μs (21.8% faster)
    assert result == {"int", "str"}


def test_field_with_multiple_keywords():
    # Test field() with multiple keyword arguments including init
    code = """
from dataclasses import dataclass, field

@dataclass
class MultiKwargs:
    name: str
    computed: int = field(default=10, init=False, repr=False)
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 9.65μs -> 8.49μs (13.7% faster)
    assert result == {"str"}


def test_mixed_init_true_false():
    # Test dataclass with mixed init=True and init=False fields
    code = """
from dataclasses import dataclass, field

@dataclass
class Mixed:
    a: int = field(init=True)
    b: str = field(init=False)
    c: float = field(init=True)
    d: bool
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 13.5μs -> 11.8μs (14.6% faster)
    assert result == {"int", "float", "bool"}
    assert "str" not in result


def test_many_fields_dataclass():
    # Test a dataclass with realistic number of fields across multiple types
    code = """
from dataclasses import dataclass

@dataclass
class UserProfile:
    id: int
    username: str
    email: str
    age: int
    is_active: bool
    balance: float
    nickname: str
    postal_code: str
    phone: str
    country: str
    region: str
    city: str
    street: str
    bio: str
    website: str
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 21.7μs -> 16.4μs (31.9% faster)
    assert result == {"int", "str", "bool", "float"}


def test_many_fields_with_field_init_false():
    # Test realistic dataclass with mix of init and non-init fields
    code = """
from dataclasses import dataclass, field

@dataclass
class CacheEntry:
    key: str
    value: int = field(init=False)
    timestamp: float
    ttl: int = field(init=False)
    version: str
    checksum: str = field(init=False)
    source: str
    priority: int
    locked: bool = field(init=False)
    metadata: str = field(init=False)
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 19.8μs -> 16.1μs (23.1% faster)
    assert result == {"str", "float", "int", "bool"}


def test_deep_generic_nesting():
    # Test deeply nested generic types
    code = """
from typing import Dict, List
from dataclasses import dataclass

@dataclass
class DeepNesting:
    data: Dict[str, List[Dict[int, List[str]]]]
"""
    tree = ast.parse(code)
    class_node = tree.body[2]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 12.8μs -> 12.2μs (5.19% faster)
    assert "Dict" in result
    assert "List" in result
    assert "str" in result
    assert "int" in result


def test_many_import_aliases():
    # Test that import aliases work correctly in realistic scenarios
    code = """
from dataclasses import dataclass

@dataclass
class Document:
    title: str
    content: str
    author: str
    created_at: str
    modified_at: str
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {
        "Optional": "typing.Optional",
        "List": "typing.List",
        "Dict": "typing.Dict",
        "CustomType": "mymodule.CustomType",
    }
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 11.8μs -> 9.46μs (24.9% faster)
    assert result == {"str"}


def test_many_fields_with_complex_field_calls():
    # Test dataclass with realistic field() usage patterns
    code = """
from dataclasses import dataclass, field

@dataclass
class Configuration:
    name: str
    debug: bool = field(default=False)
    timeout: int = field(default=30)
    retries: int = field(default=3)
    cache_size: int = field(default=100)
    log_level: str = field(default="INFO")
    workers: int = field(default=4)
    pool_size: int = field(default=10)
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 18.9μs -> 15.7μs (20.6% faster)
    assert result == {"str", "bool", "int"}


def test_attrs_with_many_fields():
    # Test attrs class with realistic number of fields
    code = """
import attrs

@attrs.define
class HTTPRequest:
    method: str
    url: str
    headers: dict
    body: str
    timeout: int
    retries: int
    follow_redirects: bool
    verify_ssl: bool
    auth_token: str
    user_agent: str
    proxy: str
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 20.4μs -> 17.2μs (18.6% faster)
    assert "str" in result
    assert "dict" in result
    assert "int" in result
    assert "bool" in result


def test_namedtuple_with_many_fields():
    # Test NamedTuple with realistic number of fields
    code = """
from typing import NamedTuple

class GeoPoint(NamedTuple):
    latitude: float
    longitude: float
    altitude: float
    accuracy: float
    timestamp: int
    source: str
    provider: str
    speed: float
"""
    tree = ast.parse(code)
    class_node = tree.body[1]
    import_aliases = {}
    result = _collect_synthetic_constructor_type_names(class_node, import_aliases)  # 13.8μs -> 10.8μs (27.2% faster)
    assert result == {"float", "int", "str"}
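
The tests above share one pattern: parse a source string, pick the class node by its index in `module.body`, and pass it to the function under test. The sketch below uses only the standard-library `ast` module to show what that class node exposes; it does not call the real function. The variable names are illustrative, and `ast.unparse` requires Python 3.9+.

```python
import ast

src = """
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int
"""
module = ast.parse(src)

# body[0] is the import statement, so the class is body[1].
# With two import lines it would be body[2], as in several tests above.
class_node = module.body[1]
assert isinstance(class_node, ast.ClassDef)

# Annotated fields appear as ast.AnnAssign statements in the class body.
fields = {
    stmt.target.id: ast.unparse(stmt.annotation)
    for stmt in class_node.body
    if isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name)
}

# Decorators are plain expressions attached to the class node.
decorators = [ast.unparse(d) for d in class_node.decorator_list]
```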

To edit these changes, run git checkout codeflash/optimize-pr1860-2026-03-18T09.30.52, commit your edits, and push.

codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Mar 18, 2026
claude bot commented Mar 18, 2026

Claude finished @codeflash-ai[bot]'s task in 40s.


PR Review Summary

  • Triage PR scope
  • Lint and type check
  • Resolve stale threads
  • Code review
  • Duplicate detection
  • Test coverage
  • Summary

Prek Checks

ruff check and ruff format both pass. mypy reports no issues. No fixes needed.

Code Review

The changes, grouped into three items below, are all semantically correct micro-optimizations:

  1. set.update() vs |= (lines 723, 727, 953): for set operands, set.update() produces the same result as |= and mutates the set in place, while avoiding the __ior__ operator dispatch. ✅

  2. Early-exit in _expr_matches_name (lines 813–816): Splits expr_name == suffix or expr_name.endswith(suffix_dot) into two separate checks. Exact match now returns before the "." + suffix string allocation. Correct — the behavior is unchanged. ✅

  3. Fast-path in _is_classvar_annotation (lines 895–905): The most impactful change. When annotation_root is an ast.Name (the overwhelmingly common case), it handles the match inline without calling _get_expr_name() or _expr_matches_name(). The logic exactly mirrors what those helpers would do for ast.Name nodes (simple names can't contain ., so endswith(".ClassVar") is always false and skipped). For all other node types, it falls through to the original _expr_matches_name call. ✅
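
The reordering in item 2 can be sketched as follows. The function names and signatures here are hypothetical; the real `_expr_matches_name` may take different arguments (for example, it may accept `None`), and only the ordering of the two checks is taken from the review.

```python
def matches_before(expr_name: str, suffix: str) -> bool:
    """Original shape: build the dotted suffix first, then test both conditions."""
    suffix_dot = "." + suffix
    return expr_name == suffix or expr_name.endswith(suffix_dot)


def matches_after(expr_name: str, suffix: str) -> bool:
    """Reordered shape: an exact match returns before any string is built."""
    if expr_name == suffix:
        return True
    return expr_name.endswith("." + suffix)


# The two versions agree on exact, dotted, and non-matching names.
cases = ["ClassVar", "typing.ClassVar", "MyClassVar", "int"]
results = [(matches_before(c, "ClassVar"), matches_after(c, "ClassVar")) for c in cases]
```

The saving applies only when the exact match succeeds; on the dotted and non-matching paths the concatenation still happens.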

No bugs, security issues, or correctness problems found.
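
The fast path described in item 3 can be sketched as below. This is a simplified illustration under stated assumptions: `dotted_name` is a hypothetical stand-in for `_get_expr_name`, import-alias resolution is omitted, and the way the annotation root is derived from a subscript is a guess at the real logic.

```python
import ast
from typing import Optional


def dotted_name(node: ast.expr) -> Optional[str]:
    """Hypothetical helper: return 'a.b.C' for Name/Attribute chains, else None."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = dotted_name(node.value)
        return f"{base}.{node.attr}" if base is not None else None
    return None


def is_classvar(annotation: ast.expr) -> bool:
    # ClassVar[int] -> inspect the subscripted value; bare ClassVar -> the node itself.
    root = annotation.value if isinstance(annotation, ast.Subscript) else annotation
    # Fast path: a bare name contains no dots, so only an exact match can succeed.
    if isinstance(root, ast.Name):
        return root.id == "ClassVar"
    # Slow path: dotted references such as typing.ClassVar.
    name = dotted_name(root)
    return name is not None and (name == "ClassVar" or name.endswith(".ClassVar"))


def parse_annotation(text: str) -> ast.expr:
    """Parse a bare annotation expression such as 'ClassVar[int]'."""
    return ast.parse(text, mode="eval").body
```

For example, `is_classvar(parse_annotation("ClassVar[int]"))` takes the fast path, while `typing.ClassVar[int]` falls through to the dotted-name comparison.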

Duplicate Detection

No duplicates detected. The three functions modified (collect_type_names_from_annotation, _expr_matches_name, _is_classvar_annotation) exist only in codeflash/languages/python/context/code_context_extractor.py.

Test Coverage

  • code_context_extractor.py: 84% coverage (1087 stmts, 171 missed) — well above the 75% threshold.
  • Existing unit tests cover collect_type_names_from_annotation (lines 4371–4420 in tests/test_code_context_extractor.py).
  • The PR includes 49 generated regression tests covering _collect_synthetic_constructor_type_names with 100% coverage per the codeflash report.
  • 3515 tests passed (1 pre-existing failure in test_tracer.py::test_tracer_initialization_normal, unrelated to this PR).
  • CI checks (prek, type-check-cli, snyk) have passed; other checks are still pending.

Optimization PR Status

This is the only open codeflash-ai optimization PR. CI is still running; the PR is eligible to merge once all checks complete.


Last updated: 2026-03-18

KRRT7 closed this Mar 18, 2026
codeflash-ai bot deleted the codeflash/optimize-pr1860-2026-03-18T09.30.52 branch March 18, 2026 09:38