⚡️ Speed up function `replace_til_no_change` by 8% #52
📄 8% (0.08x) speedup for `replace_til_no_change` in `guardrails/utils/tokenization_utils.py`

⏱️ Runtime: 6.46 milliseconds → 6.00 milliseconds (best of 5 runs)

📝 Explanation and details
The optimization precompiles the regular expression pattern once before entering the loop, rather than handing the raw pattern to `re.sub()` on every iteration.

Key change: Added `compiled = re.compile(pattern) if not isinstance(pattern, re.Pattern) else pattern` to check whether the pattern is already compiled; the loop then calls `compiled.sub()` instead of `re.sub()`.

Why it's faster: Python's `re.sub()` has to resolve the pattern to a compiled object on every call. CPython caches compiled patterns, so this is usually a cache lookup rather than a full recompilation, but the lookup and the extra function-call layer still add up when the call is repeated in a loop. Compiling once and reusing the compiled pattern object removes that redundant per-call work.

Performance impact: The 8% speedup is most pronounced in test cases that need many passes, such as "aaaa" → "aaa" → "aa" → "a" (12.6-17.1% improvement). The optimization has minimal impact on simple cases with few iterations, but provides larger gains when the loop executes many times, which is exactly when the per-call pattern overhead becomes a bottleneck.
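The change described above can be sketched as follows. This is a minimal illustration rather than the exact code in `guardrails/utils/tokenization_utils.py`: the parameter names (`input_text`, `pattern`, `replacement`) and the loop shape are assumptions, and only the `compiled = ...` line is taken from the description.

```python
import re


def replace_til_no_change(input_text, pattern, replacement):
    """Apply a regex substitution repeatedly until the text stops changing.

    Assumed signature: the parameter names here are illustrative.
    """
    # Compile once, unless the caller already passed a compiled pattern.
    compiled = re.compile(pattern) if not isinstance(pattern, re.Pattern) else pattern
    while True:
        new_text = compiled.sub(replacement, input_text)
        if new_text == input_text:
            # A full pass changed nothing, so the text is stable.
            return new_text
        input_text = new_text


# Each pass removes one leading "a", so the loop runs several times:
# "aaaa" -> "aaa" -> "aa" -> "a"
result = replace_til_no_change("aaaa", r"^aa", "a")

# A precompiled pattern is accepted as-is.
collapsed = replace_til_no_change("a    b", re.compile(r" {2,}"), " ")
```

The `isinstance` check keeps the function usable with both string patterns and precompiled `re.Pattern` objects, so existing callers would not need to change.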
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
⏪ Replay Tests and Runtime
test_pytest_testsunit_teststest_guard_log_py_testsintegration_teststest_guard_py_testsunit_testsvalidator__replay_test_0.py::test_guardrails_utils_tokenization_utils_replace_til_no_change
To edit these changes, `git checkout codeflash/optimize-replace_til_no_change-mh1olut6` and push.