@codeflash-ai codeflash-ai bot commented Oct 22, 2025

📄 101% (1.01x) speedup for parse_use_many_validator in guardrails/utils/validator_utils.py

⏱️ Runtime: 593 microseconds → 295 microseconds (best of 75 runs)
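
The headline percentage appears to follow the convention speedup = (old - new) / new. That reading is an assumption inferred from the numbers on this page, not something taken from Codeflash documentation. A minimal check against the reported runtimes (the helper name `speedup_percent` is hypothetical):

```python
def speedup_percent(old_us: float, new_us: float) -> float:
    """Percent speedup under the assumed convention (old - new) / new."""
    return (old_us - new_us) / new_us * 100.0


# Headline runtimes reported above, in microseconds.
print(f"{speedup_percent(593, 295):.0f}%")    # 101%

# Two per-test timings copied from the generated regression tests.
print(f"{speedup_percent(9.57, 3.18):.0f}%")  # 201%
print(f"{speedup_percent(34.5, 3.18):.0f}%")  # 985%
```

Under this convention a 101% speedup means the runtime roughly halved, so the old/new runtime ratio is about 2.01.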

📝 Explanation and details

The optimized code achieves a 101% speedup through three related optimizations:

1. Faster type checking in both functions:

  • Replaced isinstance(container, dict) with type(container) is dict in safe_get
  • Replaced the isinstance(args, Dict) and isinstance(args, List) checks with type(args) is dict and type(args) is not list in parse_use_many_validator
  • Exact type() comparisons are cheaper than isinstance() calls; the profiler results show the time spent on type checking dropping sharply
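
A short sketch of what this swap changes, beyond speed: isinstance() accepts subclasses, while an exact type() comparison does not, so inputs such as an OrderedDict would be classified differently. The timing loop is only a rough micro-benchmark; it assumes the original checks used the typing aliases as described above, and the absolute numbers will vary by machine and Python version.

```python
import timeit
from collections import OrderedDict
from typing import Dict

plain = {"a": 1}
ordered = OrderedDict(a=1)  # a dict subclass

# isinstance() accepts subclasses; an exact type() comparison does not.
print(isinstance(plain, dict), type(plain) is dict)      # True True
print(isinstance(ordered, dict), type(ordered) is dict)  # True False

# Rough micro-benchmark of the three styles of check.
n = 100_000
t_alias = timeit.timeit(lambda: isinstance(plain, Dict), number=n)
t_builtin = timeit.timeit(lambda: isinstance(plain, dict), number=n)
t_exact = timeit.timeit(lambda: type(plain) is dict, number=n)
print(f"isinstance(x, typing.Dict): {t_alias:.4f}s")
print(f"isinstance(x, dict):        {t_builtin:.4f}s")
print(f"type(x) is dict:            {t_exact:.4f}s")
```

Whether the subclass difference matters depends on what callers pass to the function; the generated tests on this page only exercise builtin containers.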

2. Direct list/tuple access in safe_get:

  • Added a fast-path for list/tuple containers that directly uses bracket notation container[key] with try/except handling
  • This eliminates the expensive call to safe_get_with_brackets() for common list/tuple cases
  • The profiler shows this path handles 86 calls efficiently with minimal overhead
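
The fast path described above might look roughly like the following. This is a hypothetical sketch, not the guardrails source: `safe_get_sketch` and `_fallback_get` are illustrative names, and `_fallback_get` only stands in for the real `safe_get_with_brackets()` helper, whose actual behavior is not shown on this page.

```python
from typing import Any


def _fallback_get(container: Any, key: Any, default: Any = None) -> Any:
    """Hypothetical stand-in for the slower general-purpose helper."""
    try:
        return container[key]
    except Exception:
        return default


def safe_get_sketch(container: Any, key: Any, default: Any = None) -> Any:
    # Exact-type check for dicts, then a plain .get() lookup.
    if type(container) is dict:
        return container.get(key, default)
    # Fast path: index lists and tuples directly, without the helper call.
    if type(container) is list or type(container) is tuple:
        try:
            return container[key]
        except (IndexError, TypeError):
            pass  # assumed: fall through to the helper when indexing fails
    # Strings and everything else go through the helper.
    return _fallback_get(container, key, default)


print(safe_get_sketch(("MyValidator", [1, 2]), 1))  # [1, 2]
print(safe_get_sketch(("MyValidator",), 2, {}))     # {}
```

The second call mirrors the tests where the tuple has no element at index 2 and the caller supplies a default.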

3. Reduced function call overhead:

  • The original code always called safe_get_with_brackets() for non-dict containers, but the optimized version handles list/tuple cases directly
  • Only falls back to the imported helper function for string containers or when bracket access fails

Performance characteristics from tests:

  • Small to medium workloads see 150-200% speedups (most test cases)
  • Large list and tuple arguments benefit most (about 460% faster for a 1000-element list, and up to 985% faster for a 1000-element tuple)
  • Large kwargs dictionaries passed at index 2 see minimal improvement (roughly 0.3-6% in the generated tests) since they weren't the bottleneck
  • Edge cases with type conversions show consistent 60-180% improvements

The optimizations are particularly effective because they target the hot paths identified by profiling: the type checks and container accesses that occur frequently in the validator parsing workflow.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 39 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
from typing import Any, Dict, List, Optional, Type

# imports
import pytest
from guardrails.utils.validator_utils import parse_use_many_validator


# --- Dummy base class used by the tests below ---
class Validator:
    """Dummy base class for testing."""
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs


# --- Custom Validator class for testing ---
class MyValidator(Validator):
    pass

# --- Unit Tests ---

# 1. Basic Test Cases

def test_basic_args_and_kwargs():
    # Should correctly parse tuple with args and kwargs
    use_tuple = ('MyValidator', [1, 2], {'a': 3})
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 6.45μs -> 3.86μs (67.0% faster)

def test_basic_args_only():
    # Should correctly parse tuple with only args
    use_tuple = ('MyValidator', [1, 2])
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 9.57μs -> 3.18μs (201% faster)

def test_basic_kwargs_only():
    # Should treat dict at index 1 as kwargs
    use_tuple = ('MyValidator', {'x': 10, 'y': 20})
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 9.82μs -> 3.82μs (157% faster)

def test_basic_single_arg():
    # Should wrap non-list arg in a list
    use_tuple = ('MyValidator', 42)
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 8.49μs -> 3.22μs (164% faster)

def test_basic_no_args_kwargs():
    # Should default to empty args and kwargs
    use_tuple = ('MyValidator',)
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 8.90μs -> 3.16μs (182% faster)

# 2. Edge Test Cases

def test_edge_none_args():
    # Should treat None as empty args
    use_tuple = ('MyValidator', None)
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 8.02μs -> 2.98μs (169% faster)

def test_edge_none_kwargs():
    # Should treat None at index 2 as empty kwargs
    use_tuple = ('MyValidator', [1,2], None)
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 4.71μs -> 2.52μs (87.4% faster)

def test_edge_empty_tuple():
    # Should handle empty tuple gracefully
    use_tuple = ()
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 8.18μs -> 3.16μs (159% faster)


def test_edge_args_is_tuple():
    # Should wrap tuple at index 1 in a list
    use_tuple = ('MyValidator', (1,2))
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 9.64μs -> 3.91μs (147% faster)

def test_edge_args_is_str():
    # Should wrap string at index 1 in a list
    use_tuple = ('MyValidator', 'foo')
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 8.36μs -> 3.19μs (162% faster)


def test_edge_args_is_empty_list():
    # Should handle empty list as args
    use_tuple = ('MyValidator', [])
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 9.75μs -> 3.65μs (167% faster)

def test_edge_args_is_empty_dict():
    # Should treat empty dict as kwargs
    use_tuple = ('MyValidator', {})
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 8.61μs -> 3.17μs (172% faster)

def test_edge_kwargs_is_empty_dict():
    # Should handle empty dict as kwargs
    use_tuple = ('MyValidator', [1], {})
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 4.76μs -> 2.64μs (80.7% faster)

def test_edge_kwargs_is_none_and_args_is_dict():
    # Should treat args as dict and kwargs as None
    use_tuple = ('MyValidator', {'foo': 'bar'}, None)
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 4.57μs -> 2.86μs (59.4% faster)

def test_edge_validator_cls_is_none():
    # Should return None if validator_cls is None
    use_tuple = ('MyValidator', [1,2], {'a': 3})
    codeflash_output = parse_use_many_validator(None, use_tuple); inst = codeflash_output # 3.73μs -> 1.80μs (107% faster)

# 3. Large Scale Test Cases

def test_large_args_list():
    # Should handle large args list
    large_args = list(range(1000))
    use_tuple = ('MyValidator', large_args)
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 38.2μs -> 6.75μs (466% faster)

def test_large_kwargs_dict():
    # Should handle large kwargs dict
    large_kwargs = {f'key{i}': i for i in range(1000)}
    use_tuple = ('MyValidator', [1,2], large_kwargs)
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 44.0μs -> 43.9μs (0.319% faster)

def test_large_args_and_kwargs():
    # Should handle large args and large kwargs together
    large_args = list(range(500))
    large_kwargs = {f'key{i}': i for i in range(500)}
    use_tuple = ('MyValidator', large_args, large_kwargs)
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 27.3μs -> 25.7μs (6.19% faster)

def test_large_args_is_dict():
    # Should treat large dict at index 1 as kwargs
    large_kwargs = {f'key{i}': i for i in range(1000)}
    use_tuple = ('MyValidator', large_kwargs)
    codeflash_output = parse_use_many_validator(MyValidator, use_tuple); inst = codeflash_output # 112μs -> 44.2μs (154% faster)


#------------------------------------------------
from typing import Any, Dict, List, Optional, Tuple, Type

# imports
import pytest
from guardrails.utils.validator_utils import parse_use_many_validator


# Dummy Validator class for testing
class DummyValidator:
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs

# ------------------ UNIT TESTS ------------------

# Basic Test Cases

def test_basic_args_and_kwargs():
    # Should pass positional and keyword arguments correctly
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", [1, 2], {"a": 3})); validator = codeflash_output # 5.73μs -> 3.39μs (68.7% faster)

def test_basic_args_only():
    # Should pass positional arguments only
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", [1, 2])); validator = codeflash_output # 9.56μs -> 3.34μs (186% faster)

def test_basic_kwargs_only():
    # Should pass keyword arguments only
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", {"x": 1})); validator = codeflash_output # 8.98μs -> 3.28μs (174% faster)

def test_basic_single_arg():
    # Should wrap a single argument in a list
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", 7)); validator = codeflash_output # 8.08μs -> 3.22μs (151% faster)

def test_basic_no_args():
    # Should work with no args or kwargs
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy",)); validator = codeflash_output # 8.65μs -> 3.07μs (182% faster)

# Edge Test Cases

def test_edge_empty_tuple():
    # Should work with completely empty tuple
    codeflash_output = parse_use_many_validator(DummyValidator, ()); validator = codeflash_output # 7.56μs -> 3.07μs (147% faster)

def test_edge_none_args():
    # Should treat None as a single argument
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", None)); validator = codeflash_output # 8.15μs -> 2.99μs (172% faster)

def test_edge_none_kwargs():
    # Should treat None as empty kwargs
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", [1], None)); validator = codeflash_output # 4.74μs -> 2.69μs (76.3% faster)

def test_edge_args_is_dict_and_kwargs_is_dict():
    # Should prioritize kwargs from index 2
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", {"x": 1}, {"y": 2})); validator = codeflash_output # 4.35μs -> 2.78μs (56.3% faster)

def test_edge_args_is_dict_and_no_kwargs():
    # Should use dict as kwargs if no explicit kwargs
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", {"x": 1})); validator = codeflash_output # 8.88μs -> 3.29μs (170% faster)

def test_edge_args_is_str():
    # Should wrap string in a list
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", "foo")); validator = codeflash_output # 8.04μs -> 3.19μs (152% faster)

def test_edge_args_is_tuple():
    # Should wrap tuple in a list (not split up)
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", (1, 2))); validator = codeflash_output # 8.42μs -> 3.26μs (158% faster)


def test_edge_validator_cls_is_none():
    # Should return None if validator_cls is None
    codeflash_output = parse_use_many_validator(None, ("Dummy", [1, 2], {"a": 3})); result = codeflash_output # 4.21μs -> 1.93μs (118% faster)

def test_edge_args_is_empty_list_and_kwargs_is_empty_dict():
    # Should work with empty list and dict
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", [], {})); validator = codeflash_output # 4.59μs -> 2.59μs (77.2% faster)

def test_edge_args_is_falsey_value():
    # Should wrap 0 as a single argument
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", 0)); validator = codeflash_output # 8.99μs -> 2.97μs (203% faster)

# Large Scale Test Cases

def test_large_args_list():
    # Should handle a large number of positional arguments
    large_args = list(range(1000))
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", large_args)); validator = codeflash_output # 37.6μs -> 6.69μs (462% faster)

def test_large_kwargs_dict():
    # Should handle a large number of keyword arguments
    large_kwargs = {f"key{i}": i for i in range(1000)}
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", [], large_kwargs)); validator = codeflash_output # 44.8μs -> 43.5μs (2.99% faster)

def test_large_args_and_kwargs():
    # Should handle large args and kwargs together
    large_args = list(range(500))
    large_kwargs = {f"key{i}": i for i in range(500)}
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", large_args, large_kwargs)); validator = codeflash_output # 27.2μs -> 25.9μs (5.01% faster)

def test_large_tuple_input():
    # Should handle a tuple of length >3, but ignore extra elements
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", [1, 2], {"a": 3}, "extra", "extra2")); validator = codeflash_output # 4.74μs -> 2.82μs (68.2% faster)

def test_large_args_is_tuple_of_length_1000():
    # Should wrap tuple in a list, not unpack it
    large_tuple = tuple(range(1000))
    codeflash_output = parse_use_many_validator(DummyValidator, ("Dummy", large_tuple)); validator = codeflash_output # 34.5μs -> 3.18μs (985% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
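
The comparison that the last comment refers to can be sketched as a small equivalence harness. This is an illustration of the idea only, not Codeflash's actual mechanism: `outcome`, `find_mismatches`, `wrap_original`, `wrap_optimized`, and `MyList` are all hypothetical names, and the two `wrap_*` functions are toy stand-ins for an isinstance-based and a type()-based implementation.

```python
from typing import Any, Callable, List, Sequence, Tuple


def outcome(fn: Callable[..., Any], args: Tuple[Any, ...]) -> Tuple[str, Any]:
    """Return ('ok', value) or ('error', exception type) for one call."""
    try:
        return ("ok", fn(*args))
    except Exception as exc:
        return ("error", type(exc))


def find_mismatches(
    original: Callable[..., Any],
    optimized: Callable[..., Any],
    cases: Sequence[Tuple[Any, ...]],
) -> List[Tuple[Any, ...]]:
    """List the argument tuples on which the two functions disagree."""
    return [a for a in cases if outcome(original, a) != outcome(optimized, a)]


# Toy stand-ins: wrap a non-list value in a list.
def wrap_original(value: Any) -> list:
    return value if isinstance(value, list) else [value]


def wrap_optimized(value: Any) -> list:
    return value if type(value) is list else [value]


class MyList(list):
    """A list subclass, used to probe the exact-type check."""


builtin_cases = [([1, 2],), ((1, 2),), ("foo",), (None,), (0,)]
print(find_mismatches(wrap_original, wrap_optimized, builtin_cases))  # []

subclass_cases = [(MyList([1]),)]
print(len(find_mismatches(wrap_original, wrap_optimized, subclass_cases)))  # 1
```

The two toy functions agree on every builtin input and differ only on the list subclass, which is the kind of case a regression suite built from builtin containers would not surface.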

To edit these changes, run git checkout codeflash/optimize-parse_use_many_validator-mh1m2qok and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 22, 2025 06:27
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 22, 2025