Conversation

@codeflash-ai codeflash-ai bot commented Oct 22, 2025

📄 23% (0.23x) speedup for get_validator_usage_attributes in guardrails/hub_telemetry/hub_tracing.py

⏱️ Runtime : 1.91 milliseconds → 1.55 milliseconds (best of 79 runs)

📝 Explanation and details

The optimization adds a specialized fast path for tuple indexing in the safe_get function. Instead of always falling back to the expensive safe_get_with_brackets function for non-dict containers, the optimized version directly handles tuples with a simple try/except block.

Key changes:

  • Added elif isinstance(container, tuple): check with direct container[key] access
  • This avoids the overhead of calling safe_get_with_brackets, which includes debug logging and additional exception handling logic

Why it's faster:
The line profiler shows that safe_get was spending 80.7% of its time in safe_get_with_brackets calls. Since args is a tuple in the main use case (safe_get(args, 1)), the optimization eliminates this expensive function call and replaces it with direct tuple indexing, reducing total time from 3.95ms to 2.72ms (31% faster in safe_get alone).
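For orientation, here is a minimal sketch of that fast path. The `safe_get(container, key, default=None)` signature and the simplified fallback are assumptions for illustration only; the actual Guardrails helpers add debug logging and broader error handling.

```python
from typing import Any


def safe_get_with_brackets(container: Any, key: Any, default: Any = None) -> Any:
    # Simplified stand-in for the real fallback, which also performs debug logging.
    try:
        return container[key]
    except Exception:
        return default


def safe_get(container: Any, key: Any, default: Any = None) -> Any:
    if isinstance(container, dict):
        return container.get(key, default)
    elif isinstance(container, tuple):
        # Fast path: direct tuple indexing, no extra function call or logging overhead.
        try:
            return container[key]
        except (IndexError, TypeError):
            return default
    return safe_get_with_brackets(container, key, default)
```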

Test case performance:

  • Best improvements (100%+ faster): Cases with missing or short args tuples, where the original code unnecessarily called the expensive fallback function
  • Consistent gains (20-30% faster): Standard cases with valid tuple access, benefiting from eliminated function call overhead
  • Smaller gains (6-12% faster): Large-scale tests where other operations dominate, but still show measurable improvement

The optimization is particularly effective for the common telemetry pattern of accessing validator objects from argument tuples.
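As a concrete illustration of that pattern, a caller in the telemetry layer could extract the validator like this (a hypothetical sketch reusing the `safe_get` shown above; the attribute keys mirror the generated tests below, and the real body of `get_validator_usage_attributes` may differ):

```python
def collect_validator_attributes(attrs, response, *args):
    # args arrives as a tuple, so safe_get(args, 1) now takes the tuple fast path
    # instead of falling through to safe_get_with_brackets.
    validator_self = safe_get(args, 1)
    if validator_self is not None:
        attrs["validator_name"] = validator_self.rail_alias
        attrs["validator_on_fail"] = validator_self.on_fail_descriptor
    if response is not None:
        # ValidationResult exposes `.outcome`; other response types fall back to None.
        attrs["validator_result"] = getattr(response, "outcome", None)
    return attrs
```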

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 2627 Passed |
| ⏪ Replay Tests | 255 Passed |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 71.4% |
🌀 Generated Regression Tests and Runtime
from typing import Any, Dict, List, Optional, Tuple, Union

# imports
import pytest
from guardrails.hub_telemetry.hub_tracing import get_validator_usage_attributes


# --- Dummy ValidationResult for testing ---
class ValidationResult:
    def __init__(self, outcome):
        self.outcome = outcome

# --- Dummy Validator Service for testing ---
class DummyValidator:
    def __init__(self, rail_alias, on_fail_descriptor):
        self.rail_alias = rail_alias
        self.on_fail_descriptor = on_fail_descriptor

# --- Unit Tests ---

# 1. Basic Test Cases

def test_basic_with_validator_and_validation_result():
    """Basic: validator_self present, response is ValidationResult."""
    attrs = {}
    validator = DummyValidator("my_validator", "fail_action")
    response = ValidationResult("passed")
    # args: [None, validator]
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 3.12μs -> 2.52μs (23.6% faster)

def test_basic_with_validator_and_non_validation_result():
    """Basic: validator_self present, response is not ValidationResult."""
    attrs = {}
    validator = DummyValidator("my_validator", "fail_action")
    response = "not_a_validation_result"
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.79μs -> 2.33μs (19.7% faster)

def test_basic_without_validator_with_validation_result():
    """Basic: validator_self missing, response is ValidationResult."""
    attrs = {}
    response = ValidationResult("failed")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, None); result = codeflash_output # 2.61μs -> 2.20μs (18.3% faster)

def test_basic_without_validator_and_response_none():
    """Basic: validator_self missing, response is None."""
    attrs = {}
    codeflash_output = get_validator_usage_attributes(attrs, None, None, None); result = codeflash_output # 1.40μs -> 1.25μs (12.4% faster)

def test_basic_with_prepopulated_attrs():
    """Basic: attrs dict prepopulated, should update/overwrite relevant keys."""
    attrs = {"validator_name": "old", "validator_on_fail": "oldfail", "validator_result": "oldresult"}
    validator = DummyValidator("new_validator", "newfail")
    response = ValidationResult("newresult")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.78μs -> 2.26μs (22.7% faster)

# 2. Edge Test Cases

def test_edge_args_too_short():
    """Edge: args has fewer than 2 elements, so validator_self is None."""
    attrs = {}
    response = ValidationResult("edgecase")
    codeflash_output = get_validator_usage_attributes(attrs, response, None); result = codeflash_output # 6.58μs -> 2.57μs (156% faster)

def test_edge_validator_self_is_none():
    """Edge: validator_self is explicitly None."""
    attrs = {}
    response = ValidationResult("edgecase")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, None); result = codeflash_output # 2.48μs -> 2.07μs (19.6% faster)

def test_edge_validator_has_falsey_attributes():
    """Edge: validator_self rail_alias/on_fail_descriptor are falsey values."""
    attrs = {}
    validator = DummyValidator("", None)
    response = ValidationResult("edgecase")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.76μs -> 2.23μs (23.7% faster)

def test_edge_response_is_none():
    """Edge: response is None, validator_self present."""
    attrs = {}
    validator = DummyValidator("edge_validator", "edge_fail")
    codeflash_output = get_validator_usage_attributes(attrs, None, None, validator); result = codeflash_output # 1.74μs -> 1.46μs (19.4% faster)

def test_edge_response_is_falsey_non_validation_result():
    """Edge: response is falsey but not None, not ValidationResult."""
    attrs = {}
    validator = DummyValidator("edge_validator", "edge_fail")
    response = ""
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.82μs -> 2.25μs (25.3% faster)


def test_edge_attrs_is_not_empty_dict():
    """Edge: attrs dict has extra unrelated keys, should preserve them."""
    attrs = {"foo": "bar"}
    validator = DummyValidator("name", "fail")
    response = ValidationResult("outcome")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.85μs -> 2.36μs (20.7% faster)

# 3. Large Scale Test Cases

def test_large_scale_many_validators():
    """Large scale: test with many different validators and attrs."""
    attrs = {}
    num_validators = 500
    validators = [DummyValidator(f"validator_{i}", f"fail_{i}") for i in range(num_validators)]
    responses = [ValidationResult(f"result_{i}") for i in range(num_validators)]
    for i in range(num_validators):
        attrs_i = {}
        codeflash_output = get_validator_usage_attributes(attrs_i, responses[i], None, validators[i]); result = codeflash_output # 236μs -> 215μs (9.46% faster)

def test_large_scale_many_attrs_keys():
    """Large scale: attrs dict with many unrelated keys, should preserve them."""
    attrs = {f"key_{i}": i for i in range(500)}
    validator = DummyValidator("big_validator", "big_fail")
    response = ValidationResult("big_result")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.98μs -> 2.35μs (27.0% faster)
    for i in range(500):
        pass

def test_large_scale_many_calls():
    """Large scale: call function many times to check for performance and determinism."""
    validator = DummyValidator("repeat_validator", "repeat_fail")
    response = ValidationResult("repeat_result")
    for _ in range(1000):
        attrs = {}
        codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 452μs -> 412μs (9.50% faster)

def test_large_scale_validator_self_is_none_many_times():
    """Large scale: validator_self is None in many calls."""
    response = ValidationResult("none_result")
    for _ in range(1000):
        attrs = {}
        codeflash_output = get_validator_usage_attributes(attrs, response, None, None); result = codeflash_output # 412μs -> 368μs (12.1% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from typing import Any, Dict, List, Optional, Tuple, Union

# imports
import pytest
from guardrails.hub_telemetry.hub_tracing import get_validator_usage_attributes


# Minimal ValidationResult class for testing
class ValidationResult:
    def __init__(self, outcome):
        self.outcome = outcome

# Minimal validator service class for testing
class DummyValidator:
    def __init__(self, rail_alias, on_fail_descriptor):
        self.rail_alias = rail_alias
        self.on_fail_descriptor = on_fail_descriptor

# unit tests

# ------------------ BASIC TEST CASES ------------------

def test_basic_validator_and_response():
    """Basic case: validator and response are present, attrs is empty dict."""
    attrs = {}
    validator = DummyValidator("TestValidator", "fail_action")
    response = ValidationResult("passed")
    # args: [None, validator]
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.92μs -> 2.44μs (19.9% faster)

def test_basic_attrs_prepopulated():
    """Basic case: attrs dict already has unrelated data."""
    attrs = {"existing_key": 123}
    validator = DummyValidator("VName", "fail_desc")
    response = ValidationResult("failed")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.87μs -> 2.22μs (29.5% faster)

def test_basic_no_validator():
    """Basic case: validator is missing (args too short), response is present."""
    attrs = {}
    response = ValidationResult("ok")
    codeflash_output = get_validator_usage_attributes(attrs, response); result = codeflash_output # 5.29μs -> 2.45μs (116% faster)

def test_basic_no_response():
    """Basic case: validator present, response is None."""
    attrs = {}
    validator = DummyValidator("Name", "desc")
    codeflash_output = get_validator_usage_attributes(attrs, None, None, validator); result = codeflash_output # 1.70μs -> 1.46μs (16.1% faster)

def test_basic_no_validator_no_response():
    """Basic case: neither validator nor response present."""
    attrs = {}
    codeflash_output = get_validator_usage_attributes(attrs, None); result = codeflash_output # 4.05μs -> 1.61μs (151% faster)

# ------------------ EDGE TEST CASES ------------------

def test_edge_validator_is_none():
    """Edge: validator is explicitly None in args."""
    attrs = {}
    response = ValidationResult("edgecase")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, None); result = codeflash_output # 2.65μs -> 2.14μs (24.1% faster)

def test_edge_response_not_validationresult():
    """Edge: response is not a ValidationResult, should set validator_result to None."""
    attrs = {}
    validator = DummyValidator("Edge", "fail")
    response = "not_a_validation_result"
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.79μs -> 2.24μs (24.5% faster)

def test_edge_attrs_is_nonempty_with_conflicting_keys():
    """Edge: attrs already has keys that will be overwritten."""
    attrs = {
        "validator_name": "OldName",
        "validator_on_fail": "OldFail",
        "validator_result": "OldResult"
    }
    validator = DummyValidator("NewName", "NewFail")
    response = ValidationResult("NewResult")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.74μs -> 2.17μs (26.6% faster)

def test_edge_args_is_empty_tuple():
    """Edge: args is empty tuple, should not set validator keys."""
    attrs = {}
    response = ValidationResult("ok")
    codeflash_output = get_validator_usage_attributes(attrs, response); result = codeflash_output # 5.17μs -> 2.49μs (108% faster)


def test_edge_response_is_none_and_attrs_prepopulated():
    """Edge: response is None, attrs has validator_result already."""
    attrs = {"validator_result": "old"}
    validator = DummyValidator("Name", "desc")
    codeflash_output = get_validator_usage_attributes(attrs, None, None, validator); result = codeflash_output # 1.73μs -> 1.43μs (21.0% faster)

def test_edge_validator_in_different_position():
    """Edge: validator appears in args[0] instead of args[1], should not be picked up."""
    attrs = {}
    validator = DummyValidator("WrongPos", "fail")
    response = ValidationResult("ok")
    # validator is at args[0], not args[1]
    codeflash_output = get_validator_usage_attributes(attrs, response, validator); result = codeflash_output # 8.64μs -> 2.54μs (240% faster)

# ------------------ LARGE SCALE TEST CASES ------------------

def test_large_scale_many_attrs_keys():
    """Large scale: attrs dict with many unrelated keys."""
    attrs = {f"key_{i}": i for i in range(500)}
    validator = DummyValidator("LargeValidator", "large_fail")
    response = ValidationResult("large_pass")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.80μs -> 2.19μs (28.2% faster)
    # All original keys should remain
    for i in range(500):
        pass

def test_large_scale_many_calls():
    """Large scale: repeated calls with different validators and responses."""
    attrs = {}
    validators = [DummyValidator(f"V{i}", f"fail_{i}") for i in range(100)]
    responses = [ValidationResult(f"outcome_{i}") for i in range(100)]
    for i in range(100):
        attrs = {}
        codeflash_output = get_validator_usage_attributes(attrs, responses[i], None, validators[i]); result = codeflash_output # 50.9μs -> 45.7μs (11.4% faster)

def test_large_scale_validator_args_list():
    """Large scale: args is a long list, validator at position 1."""
    attrs = {}
    validator = DummyValidator("LongListValidator", "long_fail")
    response = ValidationResult("long_pass")
    args = [None] + [validator] + [None]*997  # total length 999
    codeflash_output = get_validator_usage_attributes(attrs, response, *args); result = codeflash_output # 4.72μs -> 4.44μs (6.24% faster)

def test_large_scale_multiple_attrs_conflicts():
    """Large scale: attrs dict with many conflicting keys."""
    attrs = {f"validator_name": "old", f"validator_on_fail": "old", f"validator_result": "old"}
    for i in range(997):
        attrs[f"key_{i}"] = i
    validator = DummyValidator("NewLarge", "new_large_fail")
    response = ValidationResult("new_large_pass")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.85μs -> 2.32μs (23.1% faster)
    for i in range(997):
        pass

def test_large_scale_validator_none_everywhere():
    """Large scale: args is long list of None, no validator should be set."""
    attrs = {}
    response = ValidationResult("none_pass")
    args = [None]*1000
    codeflash_output = get_validator_usage_attributes(attrs, response, *args); result = codeflash_output # 4.39μs -> 3.77μs (16.4% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
⏪ Replay Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| test_pytest_testsunit_teststest_guard_log_py_testsintegration_teststest_guard_py_testsunit_testsvalidator__replay_test_0.py::test_guardrails_hub_telemetry_hub_tracing_get_validator_usage_attributes | 667μs | 443μs | 50.3% ✅ |

To edit these changes, run `git checkout codeflash/optimize-get_validator_usage_attributes-mh1p5ojy` and push.

Codeflash

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 22, 2025 07:54
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 22, 2025
