@codeflash-ai codeflash-ai bot commented Jul 1, 2025

⚡️ This pull request contains optimizations for PR #384

If you approve this dependent PR, these changes will be merged into the original PR branch trace-and-optimize.

This PR will be automatically closed if the original PR is merged.


📄 51% (0.51x) speedup for FunctionRanker._get_function_stats in codeflash/benchmarking/function_ranker.py

⏱️ Runtime : 497 microseconds → 330 microseconds (best of 51 runs)
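
The percentages in this report are consistent with measuring the gain against the optimized runtime, i.e. (original - optimized) / optimized. The sketch below reproduces the headline figure and one per-test figure under that assumption; `percent_faster` is a hypothetical helper name used only for illustration, not part of Codeflash.

```python
def percent_faster(original: float, optimized: float) -> float:
    """Relative gain in percent, measured against the optimized runtime."""
    return (original - optimized) / optimized * 100.0


# Headline runtime from this report: 497 microseconds -> 330 microseconds
print(f"{percent_faster(497, 330):.0f}%")    # 51%

# Per-test annotation "1.64μs -> 1.41μs (16.3% faster)"
print(f"{percent_faster(1.64, 1.41):.1f}%")  # 16.3%
```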

📝 Explanation and details

Here is an optimized version of your code, focusing on the _get_function_stats function, which your line profiling identified as the performance bottleneck.

Optimizations Applied

  1. Avoid Building Unneeded Lists:

    • Creating possible_keys as a list incurs per-call overhead.
    • Instead, directly check both keys in sequence, avoiding the list entirely.
  2. Short-circuit Early Return:

    • Check for the first key (qualified_name) and return immediately if found (no need to compute or check the second unless necessary).
  3. String Formatting Optimization:

    • Build each f-string key inline at the lookup site rather than storing it in a variable beforehand.
  4. Comment Retention:

    • All existing and relevant comments are preserved, though your original snippet has no in-method comments.


Rationale

  • No lists or unneeded temporary objects are constructed.
  • Uses .get, which performs a single dictionary lookup instead of a membership test (in) followed by a second lookup.
  • Returns immediately upon match.

This change reduces per-call runtime (497 → 330 microseconds in total across the generated tests) and avoids one list allocation per call, which adds up in codebases with many calls to _get_function_stats.
Function signatures and return values are unchanged.
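
The pattern described above can be sketched as follows. This is a minimal illustration, not the actual FunctionRanker code: the "file_path:name" key format, the stats dictionary layout, and both function names are assumptions made for the example.

```python
from __future__ import annotations


def lookup_with_list(stats: dict[str, dict], file_path: str,
                     function_name: str, qualified_name: str) -> dict | None:
    """Before: build a list of candidate keys on every call, then scan it."""
    possible_keys = [
        f"{file_path}:{qualified_name}",
        f"{file_path}:{function_name}",
    ]
    for key in possible_keys:
        if key in stats:       # membership test (first lookup)
            return stats[key]  # indexing (second lookup)
    return None


def lookup_short_circuit(stats: dict[str, dict], file_path: str,
                         function_name: str, qualified_name: str) -> dict | None:
    """After: no list, one .get per key, return on the first hit."""
    found = stats.get(f"{file_path}:{qualified_name}")
    if found is not None:
        return found
    # The second key is only formatted when the first lookup misses.
    return stats.get(f"{file_path}:{function_name}")


# Hypothetical stats table keyed by "file_path:name".
sample_stats = {
    "foo.py:MyClass.my_method": {"call_count": 4},
    "foo.py:helper": {"call_count": 2},
}
```

Both versions return the same result as long as no stored value is None; the second simply skips the list allocation and the redundant lookup.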

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 1027 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from pathlib import Path

# imports
import pytest
from codeflash.benchmarking.function_ranker import FunctionRanker


# Minimal stubs for dependencies
class FunctionToOptimize:
    def __init__(self, file_path: str, function_name: str, qualified_name: str):
        self.file_path = file_path
        self.function_name = function_name
        self.qualified_name = qualified_name

class ProfileStats:
    """
    Minimal stub for ProfileStats.
    Accepts a dictionary to simulate the .stats attribute.
    """
    def __init__(self, stats_dict_or_path):
        # If a dict is passed, use it; otherwise, assume it's a file path (simulate empty)
        if isinstance(stats_dict_or_path, dict):
            self.stats = stats_dict_or_path
        else:
            self.stats = {}

# Dummy logger for compatibility
class DummyLogger:
    def debug(self, msg): pass
    def warning(self, msg): pass

logger = DummyLogger()

# ---------------------- UNIT TESTS ----------------------

# Helper to create a FunctionRanker with custom stats
def make_ranker_with_stats(stats):
    # stats: dict of (filename, lineno, funcname): (call_count, num_callers, total_time_ns, cumulative_time_ns, callers)
    profile_stats = ProfileStats(stats)
    return FunctionRanker(trace_file_path=Path("fake/path/file.py"), profile_stats=profile_stats)

# 1. BASIC TEST CASES

from __future__ import annotations

from pathlib import Path

# imports
import pytest
from codeflash.benchmarking.function_ranker import FunctionRanker


# Minimal stub for FunctionToOptimize
class FunctionToOptimize:
    def __init__(self, file_path: str, function_name: str, qualified_name: str):
        self.file_path = file_path
        self.function_name = function_name
        self.qualified_name = qualified_name

# ========== UNIT TESTS ==========

# Helper function to create a FunctionRanker with custom stats
def make_ranker_with_stats(stats):
    # Patch ProfileStats to inject stats
    class DummyProfileStats:
        def __init__(self, path):
            self.stats = stats
    # Patch in our dummy ProfileStats
    orig = FunctionRanker.__init__
    def patched_init(self, trace_file_path):
        self.trace_file_path = trace_file_path
        self._profile_stats = DummyProfileStats(trace_file_path.as_posix())
        self._function_stats = {}
        self.load_function_stats()
    FunctionRanker.__init__ = patched_init
    try:
        ranker = FunctionRanker(Path("dummy/path"))
    finally:
        FunctionRanker.__init__ = orig  # restore even if construction raises
    return ranker

# ---------- BASIC TEST CASES ----------

def test_basic_function_match_by_qualified_name():
    # Test: function is present, match by qualified_name
    stats = {
        ("foo.py", 10, "myfunc"): (5, 1, 100, 150, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", "myfunc", "myfunc")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.57μs -> 1.16μs (35.4% faster)

def test_basic_function_match_by_function_name():
    # Test: function is present, only function_name matches (qualified_name does not)
    stats = {
        ("foo.py", 10, "myfunc"): (5, 1, 100, 150, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", "myfunc", "not_myfunc")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.64μs -> 1.41μs (16.3% faster)

def test_basic_function_not_found():
    # Test: function is not present
    stats = {
        ("foo.py", 10, "myfunc"): (5, 1, 100, 150, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", "otherfunc", "otherfunc")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.39μs -> 1.22μs (14.0% faster)


def test_edge_zero_call_count_function_ignored():
    # Test: function with call_count 0 is ignored
    stats = {
        ("foo.py", 10, "myfunc"): (0, 1, 100, 150, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", "myfunc", "myfunc")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.32μs -> 1.24μs (6.52% faster)

def test_edge_negative_call_count_function_ignored():
    # Test: function with negative call_count is ignored
    stats = {
        ("foo.py", 10, "myfunc"): (-1, 1, 100, 150, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", "myfunc", "myfunc")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.22μs -> 1.20μs (1.66% faster)

def test_edge_cumulative_less_than_total_time():
    # Test: cumulative_time_ns < total_time_ns (should allow negative time_in_callees)
    stats = {
        ("foo.py", 10, "myfunc"): (3, 1, 200, 150, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", "myfunc", "myfunc")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.33μs -> 1.03μs (29.2% faster)

def test_edge_class_method_name_parsing():
    # Test: function name with class prefix is parsed correctly
    stats = {
        ("foo.py", 15, "MyClass.my_method"): (4, 1, 80, 120, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", "my_method", "MyClass.my_method")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.37μs -> 1.05μs (30.5% faster)



def test_edge_function_with_empty_function_name():
    # Test: function with empty function name
    stats = {
        ("foo.py", 40, ""): (1, 1, 10, 10, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", "", "")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.23μs -> 941ns (30.9% faster)

def test_edge_function_with_long_names():
    # Test: function with very long name
    long_name = "a" * 200
    stats = {
        ("foo.py", 50, long_name): (1, 1, 20, 30, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", long_name, long_name)
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.39μs -> 1.15μs (20.9% faster)

def test_edge_function_with_special_characters():
    # Test: function name with special characters
    special_name = "func$#@!"
    stats = {
        ("foo.py", 60, special_name): (1, 1, 5, 5, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", special_name, special_name)
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.19μs -> 932ns (27.9% faster)

def test_edge_multiple_functions_same_name_different_files():
    # Test: same function name in different files
    stats = {
        ("foo.py", 10, "myfunc"): (1, 1, 10, 20, {}),
        ("bar.py", 10, "myfunc"): (2, 1, 20, 30, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto_foo = FunctionToOptimize("foo.py", "myfunc", "myfunc")
    fto_bar = FunctionToOptimize("bar.py", "myfunc", "myfunc")
    codeflash_output = ranker._get_function_stats(fto_foo); result_foo = codeflash_output # 1.21μs -> 992ns (22.2% faster)
    codeflash_output = ranker._get_function_stats(fto_bar); result_bar = codeflash_output # 632ns -> 461ns (37.1% faster)

def test_edge_function_with_non_ascii_name():
    # Test: function name with non-ASCII unicode characters
    unicode_name = "функция"
    stats = {
        ("foo.py", 70, unicode_name): (1, 1, 15, 25, {}),
    }
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("foo.py", unicode_name, unicode_name)
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output # 1.48μs -> 1.18μs (25.5% faster)

def test_edge_function_with_none_stats():
    # Test: ProfileStats.stats is None (should not crash)
    class DummyProfileStats:
        def __init__(self, path):
            self.stats = None
    orig = FunctionRanker.__init__
    def patched_init(self, trace_file_path):
        self.trace_file_path = trace_file_path
        self._profile_stats = DummyProfileStats(trace_file_path.as_posix())
        self._function_stats = {}
        try:
            self.load_function_stats()
        except Exception:
            pass
    FunctionRanker.__init__ = patched_init
    try:
        ranker = FunctionRanker(Path("dummy/path"))
    finally:
        FunctionRanker.__init__ = orig  # restore even if construction raises

# ---------- LARGE SCALE TEST CASES ----------

def test_large_scale_many_functions():
    # Test: 1000 functions, ensure correct one is found
    stats = {}
    for i in range(1000):
        stats[(f"file_{i}.py", i, f"func_{i}")] = (i+1, 1, 10+i, 20+i, {})
    ranker = make_ranker_with_stats(stats)
    # Pick a few random indexes to check
    for idx in [0, 100, 500, 999]:
        fto = FunctionToOptimize(f"file_{idx}.py", f"func_{idx}", f"func_{idx}")
        codeflash_output = ranker._get_function_stats(fto); result = codeflash_output

def test_large_scale_lookup_not_found():
    # Test: 1000 functions, lookup for a function not present
    stats = {}
    for i in range(1000):
        stats[(f"file_{i}.py", i, f"func_{i}")] = (i+1, 1, 10+i, 20+i, {})
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("notafile.py", "notafunc", "notafunc")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output

def test_large_scale_all_zero_call_count():
    # Test: 1000 functions, all call_count=0, none should be found
    stats = {}
    for i in range(1000):
        stats[(f"file_{i}.py", i, f"func_{i}")] = (0, 1, 10+i, 20+i, {})
    ranker = make_ranker_with_stats(stats)
    fto = FunctionToOptimize("file_500.py", "func_500", "func_500")
    codeflash_output = ranker._get_function_stats(fto); result = codeflash_output

def test_large_scale_class_methods():
    # Test: 1000 class methods, ensure correct parsing and lookup
    stats = {}
    for i in range(1000):
        stats[(f"file_{i}.py", i, f"Class{i}.method_{i}")] = (i+1, 1, 10+i, 20+i, {})
    ranker = make_ranker_with_stats(stats)
    for idx in [0, 123, 456, 999]:
        fto = FunctionToOptimize(f"file_{idx}.py", f"method_{idx}", f"Class{idx}.method_{idx}")
        codeflash_output = ranker._get_function_stats(fto); result = codeflash_output

def test_large_scale_performance():
    # Test: 1000 functions, ensure lookup is fast (functional test, not timing)
    stats = {}
    for i in range(1000):
        stats[(f"file_{i}.py", i, f"func_{i}")] = (i+1, 1, 10+i, 20+i, {})
    ranker = make_ranker_with_stats(stats)
    # Lookup for all 1000
    for i in range(1000):
        fto = FunctionToOptimize(f"file_{i}.py", f"func_{i}", f"func_{i}")
        codeflash_output = ranker._get_function_stats(fto); result = codeflash_output
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-pr384-2025-07-01T22.08.43 and push.

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Jul 1, 2025
@misrasaurabh1 misrasaurabh1 merged commit 67bd717 into trace-and-optimize Jul 1, 2025
11 of 17 checks passed
@codeflash-ai codeflash-ai bot deleted the codeflash/optimize-pr384-2025-07-01T22.08.43 branch July 1, 2025 22:10