Conversation

@codeflash-ai codeflash-ai bot commented Oct 22, 2025

📄 23% (0.23x) speedup for get_validator_usage_attributes in guardrails/hub_telemetry/hub_tracing.py

⏱️ Runtime : 1.91 milliseconds → 1.55 milliseconds (best of 79 runs)

📝 Explanation and details

The optimization adds a specialized fast path for tuple indexing in the safe_get function. Instead of always falling back to the expensive safe_get_with_brackets function for non-dict containers, the optimized version directly handles tuples with a simple try/except block.

Key changes:

  • Added elif isinstance(container, tuple): check with direct container[key] access
  • This avoids the overhead of calling safe_get_with_brackets, which includes debug logging and additional exception handling logic

Why it's faster:
The line profiler shows that safe_get was spending 80.7% of its time in safe_get_with_brackets calls. Since args is a tuple in the main use case (safe_get(args, 1)), the optimization eliminates this expensive function call and replaces it with direct tuple indexing, reducing total time from 3.95ms to 2.72ms (31% faster in safe_get alone).
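For orientation, here is a minimal sketch of that fast path. The `safe_get(container, key, default=None)` signature and the simplified fallback are assumptions for illustration only; the actual Guardrails helpers add debug logging and broader error handling.

```python
from typing import Any


def safe_get_with_brackets(container: Any, key: Any, default: Any = None) -> Any:
    # Simplified stand-in for the real fallback, which also performs debug logging.
    try:
        return container[key]
    except Exception:
        return default


def safe_get(container: Any, key: Any, default: Any = None) -> Any:
    if isinstance(container, dict):
        return container.get(key, default)
    elif isinstance(container, tuple):
        # Fast path: direct tuple indexing, no extra function call or logging overhead.
        try:
            return container[key]
        except (IndexError, TypeError):
            return default
    return safe_get_with_brackets(container, key, default)
```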

Test case performance:

  • Best improvements (100%+ faster): Cases with missing or short args tuples, where the original code unnecessarily called the expensive fallback function
  • Consistent gains (20-30% faster): Standard cases with valid tuple access, benefiting from eliminated function call overhead
  • Smaller gains (6-12% faster): Large-scale tests where other operations dominate, but still show measurable improvement

The optimization is particularly effective for the common telemetry pattern of accessing validator objects from argument tuples.
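As a concrete illustration of that pattern, a caller in the telemetry layer could extract the validator like this (a hypothetical sketch reusing the `safe_get` shown above; the attribute keys mirror the generated tests below, and the real body of `get_validator_usage_attributes` may differ):

```python
def collect_validator_attributes(attrs, response, *args):
    # args arrives as a tuple, so safe_get(args, 1) now takes the tuple fast path
    # instead of falling through to safe_get_with_brackets.
    validator_self = safe_get(args, 1)
    if validator_self is not None:
        attrs["validator_name"] = validator_self.rail_alias
        attrs["validator_on_fail"] = validator_self.on_fail_descriptor
    if response is not None:
        # ValidationResult exposes `.outcome`; other response types fall back to None.
        attrs["validator_result"] = getattr(response, "outcome", None)
    return attrs
```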

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 2627 Passed |
| ⏪ Replay Tests | 255 Passed |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 71.4% |
🌀 Generated Regression Tests and Runtime
from typing import Any, Dict, List, Optional, Tuple, Union

# imports
import pytest
from guardrails.hub_telemetry.hub_tracing import get_validator_usage_attributes


# --- Dummy ValidationResult for testing ---
class ValidationResult:
    def __init__(self, outcome):
        self.outcome = outcome

# --- Dummy Validator Service for testing ---
class DummyValidator:
    def __init__(self, rail_alias, on_fail_descriptor):
        self.rail_alias = rail_alias
        self.on_fail_descriptor = on_fail_descriptor

# --- Unit Tests ---

# 1. Basic Test Cases

def test_basic_with_validator_and_validation_result():
    """Basic: validator_self present, response is ValidationResult."""
    attrs = {}
    validator = DummyValidator("my_validator", "fail_action")
    response = ValidationResult("passed")
    # args: [None, validator]
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 3.12μs -> 2.52μs (23.6% faster)

def test_basic_with_validator_and_non_validation_result():
    """Basic: validator_self present, response is not ValidationResult."""
    attrs = {}
    validator = DummyValidator("my_validator", "fail_action")
    response = "not_a_validation_result"
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.79μs -> 2.33μs (19.7% faster)

def test_basic_without_validator_with_validation_result():
    """Basic: validator_self missing, response is ValidationResult."""
    attrs = {}
    response = ValidationResult("failed")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, None); result = codeflash_output # 2.61μs -> 2.20μs (18.3% faster)

def test_basic_without_validator_and_response_none():
    """Basic: validator_self missing, response is None."""
    attrs = {}
    codeflash_output = get_validator_usage_attributes(attrs, None, None, None); result = codeflash_output # 1.40μs -> 1.25μs (12.4% faster)

def test_basic_with_prepopulated_attrs():
    """Basic: attrs dict prepopulated, should update/overwrite relevant keys."""
    attrs = {"validator_name": "old", "validator_on_fail": "oldfail", "validator_result": "oldresult"}
    validator = DummyValidator("new_validator", "newfail")
    response = ValidationResult("newresult")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.78μs -> 2.26μs (22.7% faster)

# 2. Edge Test Cases

def test_edge_args_too_short():
    """Edge: args has fewer than 2 elements, so validator_self is None."""
    attrs = {}
    response = ValidationResult("edgecase")
    codeflash_output = get_validator_usage_attributes(attrs, response, None); result = codeflash_output # 6.58μs -> 2.57μs (156% faster)

def test_edge_validator_self_is_none():
    """Edge: validator_self is explicitly None."""
    attrs = {}
    response = ValidationResult("edgecase")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, None); result = codeflash_output # 2.48μs -> 2.07μs (19.6% faster)

def test_edge_validator_has_falsey_attributes():
    """Edge: validator_self rail_alias/on_fail_descriptor are falsey values."""
    attrs = {}
    validator = DummyValidator("", None)
    response = ValidationResult("edgecase")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.76μs -> 2.23μs (23.7% faster)

def test_edge_response_is_none():
    """Edge: response is None, validator_self present."""
    attrs = {}
    validator = DummyValidator("edge_validator", "edge_fail")
    codeflash_output = get_validator_usage_attributes(attrs, None, None, validator); result = codeflash_output # 1.74μs -> 1.46μs (19.4% faster)

def test_edge_response_is_falsey_non_validation_result():
    """Edge: response is falsey but not None, not ValidationResult."""
    attrs = {}
    validator = DummyValidator("edge_validator", "edge_fail")
    response = ""
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.82μs -> 2.25μs (25.3% faster)


def test_edge_attrs_is_not_empty_dict():
    """Edge: attrs dict has extra unrelated keys, should preserve them."""
    attrs = {"foo": "bar"}
    validator = DummyValidator("name", "fail")
    response = ValidationResult("outcome")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.85μs -> 2.36μs (20.7% faster)

# 3. Large Scale Test Cases

def test_large_scale_many_validators():
    """Large scale: test with many different validators and attrs."""
    attrs = {}
    num_validators = 500
    validators = [DummyValidator(f"validator_{i}", f"fail_{i}") for i in range(num_validators)]
    responses = [ValidationResult(f"result_{i}") for i in range(num_validators)]
    for i in range(num_validators):
        attrs_i = {}
        codeflash_output = get_validator_usage_attributes(attrs_i, responses[i], None, validators[i]); result = codeflash_output # 236μs -> 215μs (9.46% faster)

def test_large_scale_many_attrs_keys():
    """Large scale: attrs dict with many unrelated keys, should preserve them."""
    attrs = {f"key_{i}": i for i in range(500)}
    validator = DummyValidator("big_validator", "big_fail")
    response = ValidationResult("big_result")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.98μs -> 2.35μs (27.0% faster)
    for i in range(500):
        pass

def test_large_scale_many_calls():
    """Large scale: call function many times to check for performance and determinism."""
    validator = DummyValidator("repeat_validator", "repeat_fail")
    response = ValidationResult("repeat_result")
    for _ in range(1000):
        attrs = {}
        codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 452μs -> 412μs (9.50% faster)

def test_large_scale_validator_self_is_none_many_times():
    """Large scale: validator_self is None in many calls."""
    response = ValidationResult("none_result")
    for _ in range(1000):
        attrs = {}
        codeflash_output = get_validator_usage_attributes(attrs, response, None, None); result = codeflash_output # 412μs -> 368μs (12.1% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from typing import Any, Dict, List, Optional, Tuple, Union

# imports
import pytest
from guardrails.hub_telemetry.hub_tracing import get_validator_usage_attributes


# Minimal ValidationResult class for testing
class ValidationResult:
    def __init__(self, outcome):
        self.outcome = outcome

# Minimal validator service class for testing
class DummyValidator:
    def __init__(self, rail_alias, on_fail_descriptor):
        self.rail_alias = rail_alias
        self.on_fail_descriptor = on_fail_descriptor

# unit tests

# ------------------ BASIC TEST CASES ------------------

def test_basic_validator_and_response():
    """Basic case: validator and response are present, attrs is empty dict."""
    attrs = {}
    validator = DummyValidator("TestValidator", "fail_action")
    response = ValidationResult("passed")
    # args: [None, validator]
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.92μs -> 2.44μs (19.9% faster)

def test_basic_attrs_prepopulated():
    """Basic case: attrs dict already has unrelated data."""
    attrs = {"existing_key": 123}
    validator = DummyValidator("VName", "fail_desc")
    response = ValidationResult("failed")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.87μs -> 2.22μs (29.5% faster)

def test_basic_no_validator():
    """Basic case: validator is missing (args too short), response is present."""
    attrs = {}
    response = ValidationResult("ok")
    codeflash_output = get_validator_usage_attributes(attrs, response); result = codeflash_output # 5.29μs -> 2.45μs (116% faster)

def test_basic_no_response():
    """Basic case: validator present, response is None."""
    attrs = {}
    validator = DummyValidator("Name", "desc")
    codeflash_output = get_validator_usage_attributes(attrs, None, None, validator); result = codeflash_output # 1.70μs -> 1.46μs (16.1% faster)

def test_basic_no_validator_no_response():
    """Basic case: neither validator nor response present."""
    attrs = {}
    codeflash_output = get_validator_usage_attributes(attrs, None); result = codeflash_output # 4.05μs -> 1.61μs (151% faster)

# ------------------ EDGE TEST CASES ------------------

def test_edge_validator_is_none():
    """Edge: validator is explicitly None in args."""
    attrs = {}
    response = ValidationResult("edgecase")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, None); result = codeflash_output # 2.65μs -> 2.14μs (24.1% faster)

def test_edge_response_not_validationresult():
    """Edge: response is not a ValidationResult, should set validator_result to None."""
    attrs = {}
    validator = DummyValidator("Edge", "fail")
    response = "not_a_validation_result"
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.79μs -> 2.24μs (24.5% faster)

def test_edge_attrs_is_nonempty_with_conflicting_keys():
    """Edge: attrs already has keys that will be overwritten."""
    attrs = {
        "validator_name": "OldName",
        "validator_on_fail": "OldFail",
        "validator_result": "OldResult"
    }
    validator = DummyValidator("NewName", "NewFail")
    response = ValidationResult("NewResult")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.74μs -> 2.17μs (26.6% faster)

def test_edge_args_is_empty_tuple():
    """Edge: args is empty tuple, should not set validator keys."""
    attrs = {}
    response = ValidationResult("ok")
    codeflash_output = get_validator_usage_attributes(attrs, response); result = codeflash_output # 5.17μs -> 2.49μs (108% faster)


def test_edge_response_is_none_and_attrs_prepopulated():
    """Edge: response is None, attrs has validator_result already."""
    attrs = {"validator_result": "old"}
    validator = DummyValidator("Name", "desc")
    codeflash_output = get_validator_usage_attributes(attrs, None, None, validator); result = codeflash_output # 1.73μs -> 1.43μs (21.0% faster)

def test_edge_validator_in_different_position():
    """Edge: validator appears in args[0] instead of args[1], should not be picked up."""
    attrs = {}
    validator = DummyValidator("WrongPos", "fail")
    response = ValidationResult("ok")
    # validator is at args[0], not args[1]
    codeflash_output = get_validator_usage_attributes(attrs, response, validator); result = codeflash_output # 8.64μs -> 2.54μs (240% faster)

# ------------------ LARGE SCALE TEST CASES ------------------

def test_large_scale_many_attrs_keys():
    """Large scale: attrs dict with many unrelated keys."""
    attrs = {f"key_{i}": i for i in range(500)}
    validator = DummyValidator("LargeValidator", "large_fail")
    response = ValidationResult("large_pass")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.80μs -> 2.19μs (28.2% faster)
    # All original keys should remain
    for i in range(500):
        pass

def test_large_scale_many_calls():
    """Large scale: repeated calls with different validators and responses."""
    attrs = {}
    validators = [DummyValidator(f"V{i}", f"fail_{i}") for i in range(100)]
    responses = [ValidationResult(f"outcome_{i}") for i in range(100)]
    for i in range(100):
        attrs = {}
        codeflash_output = get_validator_usage_attributes(attrs, responses[i], None, validators[i]); result = codeflash_output # 50.9μs -> 45.7μs (11.4% faster)

def test_large_scale_validator_args_list():
    """Large scale: args is a long list, validator at position 1."""
    attrs = {}
    validator = DummyValidator("LongListValidator", "long_fail")
    response = ValidationResult("long_pass")
    args = [None] + [validator] + [None]*997  # total length 999
    codeflash_output = get_validator_usage_attributes(attrs, response, *args); result = codeflash_output # 4.72μs -> 4.44μs (6.24% faster)

def test_large_scale_multiple_attrs_conflicts():
    """Large scale: attrs dict with many conflicting keys."""
    attrs = {f"validator_name": "old", f"validator_on_fail": "old", f"validator_result": "old"}
    for i in range(997):
        attrs[f"key_{i}"] = i
    validator = DummyValidator("NewLarge", "new_large_fail")
    response = ValidationResult("new_large_pass")
    codeflash_output = get_validator_usage_attributes(attrs, response, None, validator); result = codeflash_output # 2.85μs -> 2.32μs (23.1% faster)
    for i in range(997):
        pass

def test_large_scale_validator_none_everywhere():
    """Large scale: args is long list of None, no validator should be set."""
    attrs = {}
    response = ValidationResult("none_pass")
    args = [None]*1000
    codeflash_output = get_validator_usage_attributes(attrs, response, *args); result = codeflash_output # 4.39μs -> 3.77μs (16.4% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
⏪ Replay Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| test_pytest_testsunit_teststest_guard_log_py_testsintegration_teststest_guard_py_testsunit_testsvalidator__replay_test_0.py::test_guardrails_hub_telemetry_hub_tracing_get_validator_usage_attributes | 667μs | 443μs | 50.3% ✅ |

To edit these changes, run `git checkout codeflash/optimize-get_validator_usage_attributes-mh1p5ojy` and push.

Codeflash

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 22, 2025 07:54
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 22, 2025
