⚡️ Speed up method `FunctionRanker._get_function_stats` by 51% in PR #384 (`trace-and-optimize`) #466

codeflash-ai · 2025-07-01T22:08:49Z

⚡️ This pull request contains optimizations for PR #384

If you approve this dependent PR, these changes will be merged into the original PR branch trace-and-optimize.

This PR will be automatically closed if the original PR is merged.

📄 51% (0.51x) speedup for `FunctionRanker._get_function_stats` in `codeflash/benchmarking/function_ranker.py`

⏱️ Runtime : 497 microseconds → 330 microseconds (best of 51 runs)
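
The headline percentage can be reproduced from the two runtimes above. The following is a quick check, assuming the speedup is computed as time saved relative to the optimized runtime (an assumption about the convention used here; the variable names are illustrative):

```python
# Runtimes reported above, in microseconds.
original_us = 497
optimized_us = 330

# Assumed convention: speedup = (original - optimized) / optimized.
speedup = (original_us - optimized_us) / optimized_us

print(f"{speedup:.2f}x, i.e. {speedup:.0%} faster")
```

Under that convention the result rounds to 0.51, which is consistent with the "51% (0.51x)" figure in the summary line.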

📝 Explanation and details

Here is an optimized version of your code, focusing on the `_get_function_stats` function, the proven performance bottleneck per your line profiling.

Optimizations Applied

1. Avoid Building Unneeded Lists:
   - Creating `possible_keys` as a list incurs per-call overhead.
   - Instead, directly check both keys in sequence, avoiding the list entirely.
2. Short-circuit Early Return:
   - Check for the first key (`qualified_name`) and return immediately if found (no need to compute or check the second unless necessary).
3. String Formatting Optimization:
   - Use f-strings directly in the condition rather than storing/interpolating beforehand.
4. Comment Retention:
   - All existing and relevant comments are preserved, though your original snippet has no in-method comments.
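
The diff itself is not reproduced on this page, so the following is only a minimal sketch of the pattern described in the list above. It assumes the stats live in a plain dict keyed by `"{file_path}:{name}"` strings; that key format, the `FunctionRef` class, and both function names are hypothetical stand-ins, not the actual `FunctionRanker` code.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FunctionRef:
    """Hypothetical stand-in for FunctionToOptimize."""
    file_path: str
    function_name: str
    qualified_name: str


def lookup_with_list(stats: dict[str, dict], fn: FunctionRef) -> dict | None:
    # "Before" shape: both candidate keys are formatted and put in a list
    # on every call, even when the first one would have matched.
    possible_keys = [
        f"{fn.file_path}:{fn.qualified_name}",
        f"{fn.file_path}:{fn.function_name}",
    ]
    for key in possible_keys:
        if key in stats:
            return stats[key]
    return None


def lookup_direct(stats: dict[str, dict], fn: FunctionRef) -> dict | None:
    # "After" shape: try the qualified name first and return on a hit;
    # the second key is only formatted when the first lookup misses.
    # Assumes stored values are never None.
    found = stats.get(f"{fn.file_path}:{fn.qualified_name}")
    if found is not None:
        return found
    return stats.get(f"{fn.file_path}:{fn.function_name}")


stats = {"foo.py:myfunc": {"call_count": 5}}
fn = FunctionRef("foo.py", "myfunc", "pkg.myfunc")
print(lookup_with_list(stats, fn) == lookup_direct(stats, fn))
```

The two versions agree as long as no stored value is `None`; if `None` were a legitimate stored value, the `.get` form would need a sentinel default instead.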

Rationale

- No lists or unneeded temporary objects are constructed.
- Uses `.get`, which needs one dictionary lookup on a hit, where an `in` check followed by indexing needs two.
- Returns immediately upon match.
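
Whether `.get` actually beats `in` followed by indexing depends on the interpreter version and the hit rate, so it is worth measuring rather than assuming. Below is a small micro-benchmark sketch using the standard-library `timeit` module; the dict contents and function names are made up for illustration and are unrelated to the real profiling data.

```python
import timeit

# Illustrative data: 1000 string keys, similar in shape to the tests below.
stats = {f"file_{i}.py:func_{i}": i for i in range(1000)}
key = "file_500.py:func_500"


def with_in_then_index():
    # Two hash lookups on a hit: the membership test, then the indexing.
    if key in stats:
        return stats[key]
    return None


def with_get():
    # One hash lookup, at the cost of a method call.
    return stats.get(key)


t_in = timeit.timeit(with_in_then_index, number=200_000)
t_get = timeit.timeit(with_get, number=200_000)
print(f"in + index: {t_in:.4f}s, .get: {t_get:.4f}s")
```

The absolute numbers vary by machine; only the relative difference on the target interpreter is meaningful.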

This change reduces runtime (497 → 330 microseconds across the generated tests) and avoids allocating a temporary list on each call, which adds up in codebases with many calls to `_get_function_stats`.
Function signatures and return values are unchanged.

✅ Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 1027 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime

from pathlib import Path

# imports
import pytest
from codeflash.benchmarking.function_ranker import FunctionRanker


# Minimal stubs for dependencies
class FunctionToOptimize:
    def __init__(self, file_path: str, function_name: str, qualified_name: str):
        self.file_path = file_path
        self.function_name = function_name
        self.qualified_name = qualified_name

class ProfileStats:
    """
    Minimal stub for ProfileStats.
    Accepts a dictionary to simulate the .stats attribute.
    """
    def __init__(self, stats_dict_or_path):
        # If a dict is passed, use it; otherwise, assume it's a file path (simulate empty)
        if isinstance(stats_dict_or_path, dict):
            self.stats = stats_dict_or_path
        else:
            self.stats = {}

# Dummy logger for compatibility
class DummyLogger:
    def debug(self, msg): pass
    def warning(self, msg): pass
logger = DummyLogger()

# ---------------------- UNIT TESTS ----------------------

# Helper to create a FunctionRanker with custom stats
def make_ranker_with_stats(stats):
    # stats: dict of (filename, lineno, funcname): (call_count, num_callers, total_time_ns, cumulative_time_ns, callers)
    profile_stats = ProfileStats(stats)
    return FunctionRanker(trace_file_path=Path("fake/path/file.py"), profile_stats=profile_stats)

# 1. BASIC TEST CASES

from __future__ import annotations

from pathlib import Path

# imports
import pytest
from codeflash.benchmarking.function_ranker import FunctionRanker


# Minimal stub for FunctionToOptimize
class FunctionToOptimize:
    def __init__(self, file_path: str, function_name: str, qualified_name: str):
        self.file_path = file_path
        self.function_name = function_name
        self.qualified_name = qualified_name

# ========== UNIT TESTS ==========

# Helper function to create a FunctionRanker with custom stats
def make_ranker_with_stats(stats):
    # Patch ProfileStats to inject stats
    class DummyProfileStats:
        def __init__(self, path):
            self.stats = stats
    # Patch in our dummy ProfileStats
    orig = FunctionRanker.__init__
    def patched_init(self, trace_file_path):
        self.trace_file_path = trace_file_path
        self._profile_stats = DummyProfileStats(trace_file_path.as_posix())
        self._function_stats = {}
        self.load_function_stats()
    FunctionRanker.__init__ = patched_init
    try:
        ranker = FunctionRanker(Path("dummy/path"))
    finally:
        # Restore the real __init__ even if construction raises
        FunctionRanker.__init__ = orig
    return ranker

# ---------- BASIC TEST CASES ----------

def test_basic_function_match_by_qualified_name():
    # Test: function is present, match by qualified_name
    stats = {
        ("foo.py", 10, "myfunc"): (5, 1, 100, 150, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", "myfunc", "myfunc")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.57μs -> 1.16μs (35.4% faster)

def test_basic_function_match_by_function_name():
    # Test: function is present, only function_name matches (qualified_name does not)
    stats = {
        ("foo.py", 10, "myfunc"): (5, 1, 100, 150, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", "myfunc", "not_myfunc")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.64μs -> 1.41μs (16.3% faster)

def test_basic_function_not_found():
    # Test: function is not present
    stats = {
        ("foo.py", 10, "myfunc"): (5, 1, 100, 150, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", "otherfunc", "otherfunc")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.39μs -> 1.22μs (14.0% faster)


def test_edge_zero_call_count_function_ignored():
    # Test: function with call_count 0 is ignored
    stats = {
        ("foo.py", 10, "myfunc"): (0, 1, 100, 150, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", "myfunc", "myfunc")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.32μs -> 1.24μs (6.52% faster)

def test_edge_negative_call_count_function_ignored():
    # Test: function with negative call_count is ignored
    stats = {
        ("foo.py", 10, "myfunc"): (-1, 1, 100, 150, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", "myfunc", "myfunc")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.22μs -> 1.20μs (1.66% faster)

def test_edge_cumulative_less_than_total_time():
    # Test: cumulative_time_ns < total_time_ns (should allow negative time_in_callees)
    stats = {
        ("foo.py", 10, "myfunc"): (3, 1, 200, 150, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", "myfunc", "myfunc")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.33μs -> 1.03μs (29.2% faster)

def test_edge_class_method_name_parsing():
    # Test: function name with class prefix is parsed correctly
    stats = {
        ("foo.py", 15, "MyClass.my_method"): (4, 1, 80, 120, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", "my_method", "MyClass.my_method")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.37μs -> 1.05μs (30.5% faster)



def test_edge_function_with_empty_function_name():
    # Test: function with empty function name
    stats = {
        ("foo.py", 40, ""): (1, 1, 10, 10, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", "", "")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.23μs -> 941ns (30.9% faster)

def test_edge_function_with_long_names():
    # Test: function with very long name
    long_name = "a" * 200
    stats = {
        ("foo.py", 50, long_name): (1, 1, 20, 30, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", long_name, long_name)
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.39μs -> 1.15μs (20.9% faster)

def test_edge_function_with_special_characters():
    # Test: function name with special characters
    special_name = "func$#@!"
    stats = {
        ("foo.py", 60, special_name): (1, 1, 5, 5, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", special_name, special_name)
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.19μs -> 932ns (27.9% faster)

def test_edge_multiple_functions_same_name_different_files():
    # Test: same function name in different files
    stats = {
        ("foo.py", 10, "myfunc"): (1, 1, 10, 20, {}),
        ("bar.py", 10, "myfunc"): (2, 1, 20, 30, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto_foo = FunctionToOptimize("foo.py", "myfunc", "myfunc")
    fto_bar = FunctionToOptimize("bar.py", "myfunc", "myfunc")
    codeflash_output = ranker._get_function_stats(fto_foo); result_foo = codeflash_output # 1.21μs -> 992ns (22.2% faster)
    codeflash_output = ranker._get_function_stats(fto_bar); result_bar = codeflash_output # 632ns -> 461ns (37.1% faster)

def test_edge_function_with_non_ascii_name():
    # Test: function name with non-ASCII unicode characters
    unicode_name = "функция"
    stats = {
        ("foo.py", 70, unicode_name): (1, 1, 15, 25, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", unicode_name, unicode_name)
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.48μs -> 1.18μs (25.5% faster)

def test_edge_function_with_none_stats():
    # Test: ProfileStats.stats is None (should not crash)
    class DummyProfileStats:
        def __init__(self, path):
            self.stats = None
    orig = FunctionRanker.__init__
    def patched_init(self, trace_file_path):
        self.trace_file_path = trace_file_path
        self._profile_stats = DummyProfileStats(trace_file_path.as_posix())
        self._function_stats = {}
        try:
            self.load_function_stats()
        except Exception:
            pass
    FunctionRanker.__init__ = patched_init
    try:
        ranker = FunctionRanker(Path("dummy/path"))
    finally:
        # Restore the real __init__ even if construction raises
        FunctionRanker.__init__ = orig

# ---------- LARGE SCALE TEST CASES ----------

def test_large_scale_many_functions():
    # Test: 1000 functions, ensure correct one is found
    stats = {}
    for i in range(1000):
        stats[(f"file_{i}.py", i, f"func_{i}")] = (i+1, 1, 10+i, 20+i, {})
    ranker = make_ranker_with_stats(stats)
    # Pick a few random indexes to check
    for idx in [0, 100, 500, 999]:
        fto = FunctionToOptimize(f"file_{idx}.py", f"func_{idx}", f"func_{idx}")
        codeflash_output = ranker._get_function_stats(fto); result = codeflash_output

def test_large_scale_lookup_not_found():
    # Test: 1000 functions, lookup for a function not present
    stats = {}
    for i in range(1000):
        stats[(f"file_{i}.py", i, f"func_{i}")] = (i+1, 1, 10+i, 20+i, {})
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("notafile.py", "notafunc", "notafunc")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output

def test_large_scale_all_zero_call_count():
    # Test: 1000 functions, all call_count=0, none should be found
    stats = {}
    for i in range(1000):
        stats[(f"file_{i}.py", i, f"func_{i}")] = (0, 1, 10+i, 20+i, {})
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("file_500.py", "func_500", "func_500")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output

def test_large_scale_class_methods():
    # Test: 1000 class methods, ensure correct parsing and lookup
    stats = {}
    for i in range(1000):
        stats[(f"file_{i}.py", i, f"Class{i}.method_{i}")] = (i+1, 1, 10+i, 20+i, {})
    ranker = make_ranker_with_stats(stats)
    for idx in [0, 123, 456, 999]:
        fto = FunctionToOptimize(f"file_{idx}.py", f"method_{idx}", f"Class{idx}.method_{idx}")
        codeflash_output = ranker._get_function_stats(fto); result = codeflash_output

def test_large_scale_performance():
    # Test: 1000 functions, ensure lookup is fast (functional test, not timing)
    stats = {}
    for i in range(1000):
        stats[(f"file_{i}.py", i, f"func_{i}")] = (i+1, 1, 10+i, 20+i, {})
    ranker = make_ranker_with_stats(stats)
    # Lookup for all 1000
    for i in range(1000):
        fto = FunctionToOptimize(f"file_{i}.py", f"func_{i}", f"func_{i}")
        codeflash_output = ranker._get_function_stats(fto); result = codeflash_output
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
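
The comment above describes an equivalence check: the same inputs are run through the original and the optimized code, and the outputs are compared. Here is a minimal sketch of that idea, not Codeflash's actual harness; `outputs_match`, `original_lookup`, and `optimized_lookup` are hypothetical names introduced for illustration.

```python
from typing import Any, Callable, Iterable


def outputs_match(
    original: Callable[..., Any],
    optimized: Callable[..., Any],
    cases: Iterable[tuple],
) -> bool:
    """Return True if both callables agree on every argument tuple."""
    for args in cases:
        if original(*args) != optimized(*args):
            return False
    return True


# Hypothetical "before" and "after" implementations of a two-key lookup.
def original_lookup(stats, key, fallback):
    for k in [key, fallback]:
        if k in stats:
            return stats[k]
    return None


def optimized_lookup(stats, key, fallback):
    hit = stats.get(key)
    return hit if hit is not None else stats.get(fallback)


stats = {"foo.py:myfunc": 1}
cases = [
    (stats, "foo.py:myfunc", "x"),  # first key hits
    (stats, "x", "foo.py:myfunc"),  # only the fallback hits
    (stats, "x", "y"),              # neither key is present
]
print(outputs_match(original_lookup, optimized_lookup, cases))
```

Covering the hit, fallback, and miss paths separately matters here, because an early-return rewrite can change behavior on exactly one of them.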

To edit these changes, run `git checkout codeflash/optimize-pr384-2025-07-01T22.08.43` and push.


codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Jul 1, 2025

codeflash-ai bot mentioned this pull request Jul 1, 2025

introduce a new integrated "codeflash optimize" command #384

Merged

misrasaurabh1 merged commit 67bd717 into trace-and-optimize Jul 1, 2025
11 of 17 checks passed

codeflash-ai bot deleted the codeflash/optimize-pr384-2025-07-01T22.08.43 branch July 1, 2025 22:10
