
Conversation


@codeflash-ai codeflash-ai bot commented Oct 22, 2025

📄 5% (0.05x) speedup for TextSplitter.prompt_template_token_length in guardrails/utils/docs_utils.py

⏱️ Runtime : 13.7 microseconds → 13.0 microseconds (best of 24 runs)

📝 Explanation and details

The optimization introduces a caching mechanism that eliminates redundant template variable parsing.

**Key changes:**

- **Variable caching in BasePrompt**: The `variable_names` are computed once during initialization using `get_template_variables()` and stored as an instance variable, rather than being recalculated on every call to `get_prompt_variables()`.
- **Direct cache access in Prompt.format()**: Instead of calling `get_template_variables(self.source)` each time, the method now directly uses the cached `self.variable_names`.
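The pattern the two changes describe can be sketched as below. This is a minimal, hypothetical reconstruction, not the actual guardrails code: the regex-based `get_template_variables` and the `format()` body are assumptions standing in for the real implementations.

```python
import re

# Assumed stand-in for guardrails' template parser: extracts
# ${name}-style placeholders from a template string.
def get_template_variables(template: str) -> list[str]:
    return re.findall(r"\$\{(\w+)\}", template)

class BasePrompt:
    def __init__(self, source: str):
        self.source = source
        # Optimized path: parse the template once at construction
        # time and cache the result as an instance attribute.
        self.variable_names = get_template_variables(source)

    def get_prompt_variables(self) -> list[str]:
        # Cheap attribute lookup instead of re-parsing the template.
        return self.variable_names

class Prompt(BasePrompt):
    def format(self, **kwargs) -> str:
        # Use the cached names rather than calling
        # get_template_variables(self.source) on every call.
        result = self.source
        for name in self.variable_names:
            if name in kwargs:
                result = result.replace("${" + name + "}", str(kwargs[name]))
        return result
```

The key point is that `get_template_variables` now runs exactly once per `Prompt` instance, in `__init__`, no matter how many times `format()` or `get_prompt_variables()` is called afterwards.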

**Why this improves performance:**
Template variable extraction involves parsing the template string to identify placeholder variables (e.g., `${variable_name}`). This parsing operation has computational overhead that scales with template complexity. By caching the results during object initialization, subsequent calls to `get_prompt_variables()` and `format()` become simple attribute lookups instead of string parsing operations.
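A quick micro-benchmark makes the parse-versus-lookup gap concrete. Everything here is a hypothetical illustration: the regex is an assumed stand-in for guardrails' `get_template_variables()`, and the 50-variable template is invented for the demo.

```python
import re
import timeit

# Synthetic template with 50 ${...} placeholders.
TEMPLATE = " ".join("${var_%d}" % i for i in range(50))

def parse_every_time() -> list:
    # Pre-optimization behavior: scan the template string on each call.
    return re.findall(r"\$\{(\w+)\}", TEMPLATE)

# One-time parse, as done in __init__ after the change.
CACHED = re.findall(r"\$\{(\w+)\}", TEMPLATE)

def use_cache() -> list:
    # Post-optimization behavior: return the precomputed list.
    return CACHED

parse_time = timeit.timeit(parse_every_time, number=10_000)
cache_time = timeit.timeit(use_cache, number=10_000)
```

On any reasonable machine `cache_time` comes out far below `parse_time`, and the gap widens as the template grows, which is why the win shows up most on frequently reused prompts.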

**Performance characteristics:**
The 5% speedup is most pronounced in scenarios where:

- Prompt objects are reused multiple times (common in production workflows)
- Templates contain multiple variables requiring extraction
- The `format()` method is called repeatedly on the same prompt instance
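One trade-off of the eager approach in the bullets above: prompts that are constructed but never formatted still pay the parse cost in `__init__`. A lazy variant (purely hypothetical, not what this PR does) would get the same repeated-call benefit via `functools.cached_property`:

```python
import re
from functools import cached_property

class LazyPrompt:
    # Hypothetical alternative design, not the guardrails implementation:
    # defer template parsing to first use instead of doing it in __init__.
    def __init__(self, source: str):
        self.source = source

    @cached_property
    def variable_names(self) -> list[str]:
        # Computed once on first access, then memoized on the instance.
        return re.findall(r"\$\{(\w+)\}", self.source)
```

Eager caching in `__init__` is the simpler choice when prompts are almost always formatted at least once, as in the workflows described above.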

The line profiler shows that while the `get_prompt_variables()` call itself becomes only slightly faster (15% vs 13.6% of total time), the overall benefit comes from eliminating redundant parsing work across the entire prompt processing pipeline.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 6 Passed |
| 🌀 Generated Regression Tests | 🔘 None Found |
| ⏪ Replay Tests | 1 Passed |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| unit_tests/utils/test_docs_utils.py::test_prompt_template_token_length | 6.54μs | 6.31μs | 3.74% ✅ |
⏪ Replay Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| test_pytest_testsunit_teststest_guard_log_py_testsintegration_teststest_guard_py_testsunit_testsvalidator__replay_test_0.py::test_guardrails_utils_docs_utils_TextSplitter_prompt_template_token_length | 7.11μs | 6.68μs | 6.48% ✅ |

To edit these changes, run `git checkout codeflash/optimize-TextSplitter.prompt_template_token_length-mh1qq7nx` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 22, 2025 08:37
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 22, 2025
