Fix: Ollama chunk parser fails to handle complete reasoning blocks in single chunk #15166
+175 −13
## Problem
The original Ollama streaming chunk parser could not handle a complete `<think>...</think>` block within a single chunk. This caused critical issues:

**Primary Issue - Complete Reasoning Blocks:** a chunk containing both the opening and closing tag at once was misclassified.

**Secondary Issues - Mixed Content:**

- Content before the `<think>` tag: `"Regular text <think>thought"` → `"Regular text"` was lost or misclassified
- Content after the `</think>` tag: `"thought</think> more text"` → `"more text"` was lost or misclassified

## Root Cause
The original implementation only checked for individual tags (`<think>` OR `</think>`), never both simultaneously. It could not detect when a complete reasoning block existed in one chunk, leading to incorrect content classification.

## Solution
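The failure mode can be reproduced with a minimal sketch. This is a hypothetical simplification of the old per-tag logic (names invented for illustration), not the actual litellm code:

```python
def buggy_classify(chunk: str, in_reasoning: bool) -> tuple[str, bool]:
    """Hypothetical simplification of the OLD per-tag logic.

    Each branch handles only ONE tag, so a chunk that contains BOTH
    tags takes the first branch and the closing tag is never processed.
    """
    if "<think>" in chunk:
        # Wrong when the chunk also contains </think> plus regular
        # content: everything after <think> is treated as reasoning.
        return chunk.replace("<think>", ""), True
    if "</think>" in chunk:
        return chunk.replace("</think>", ""), False
    return chunk, in_reasoning

# A complete block in one chunk is misclassified: the stray </think>
# survives in the output and the parser stays in "reasoning" mode.
text, still_reasoning = buggy_classify("<think>plan</think>The answer is 4.", False)
```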
Modified `litellm/llms/ollama/chat/transformation.py`:

- **Added complete block detection:** now checks for BOTH tags in the same chunk first
- **Proper content splitting for partial tags:**
  - `<think>` only: split to separate content before the tag from reasoning after it
  - `</think>` only: split to separate reasoning before the tag from content after it
- **Regex-based whitespace normalization:** cleanly removes tags and extra spaces in complete blocks
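The steps above can be sketched as follows. The helper name and return shape are illustrative, not the actual `chunk_parser` signature, and the real implementation also carries reasoning state across chunks:

```python
import re

def split_reasoning_chunk(text: str) -> tuple[str, str]:
    """Sketch of the fixed logic (hypothetical helper): check for a
    COMPLETE block first, then fall back to the partial-tag cases.
    Returns (content, reasoning)."""
    open_tag, close_tag = "<think>", "</think>"

    if open_tag in text and close_tag in text:
        # Complete block: everything between the tags is reasoning.
        reasoning = text.split(open_tag, 1)[1].split(close_tag, 1)[0]
        # Regex-based whitespace normalization after stripping the tags.
        content = re.sub(
            r"\s+", " ", text.replace(open_tag + reasoning + close_tag, " ")
        ).strip()
        return content, reasoning.strip()

    if open_tag in text:
        # Opening tag only: content before the tag, reasoning after it.
        before, after = text.split(open_tag, 1)
        return before.strip(), after.strip()

    if close_tag in text:
        # Closing tag only: reasoning before the tag, content after it.
        before, after = text.split(close_tag, 1)
        return after.strip(), before.strip()

    return text, ""
```

For example, `split_reasoning_chunk("Regular text <think>thought</think> more")` yields `("Regular text more", "thought")`.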
Added a comprehensive test suite, `tests/llm_responses_api_testing/test_ollama_chunk_parser.py`, with 7 test cases:

- `test_full_think_block_in_one_chunk` - main fix: complete block handling
- `test_content_before_and_after_think_tag` - complete block with surrounding text
- `test_think_tags_split_across_chunks` - multi-chunk reasoning
- `test_content_before_think_tag` - mixed content before the tag
- `test_content_after_think_end_tag` - mixed content after the tag
- `test_mixed_content_across_multiple_chunks` - complex multi-chunk scenario
- `test_whitespace_chunks` - edge-case handling

All tests passing ✅ | Code quality checks passing ✅
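For illustration, the primary case (`test_full_think_block_in_one_chunk`) might look roughly like this, written here against a minimal stand-in parser rather than the real litellm `chunk_parser`:

```python
import re

def parse_chunk(chunk: str) -> dict:
    """Minimal stand-in for the real parser (hypothetical), returning
    content and reasoning_content the way a streamed delta would."""
    m = re.search(r"<think>(.*?)</think>", chunk, re.DOTALL)
    if m is None:
        return {"content": chunk, "reasoning_content": None}
    # Drop the block, then normalize the whitespace left behind.
    content = re.sub(r"\s+", " ", chunk[: m.start()] + " " + chunk[m.end():]).strip()
    return {"content": content, "reasoning_content": m.group(1).strip()}

def test_full_think_block_in_one_chunk():
    # The whole reasoning block arrives in a single chunk.
    parsed = parse_chunk("<think>add 2 and 2</think>The answer is 4.")
    assert parsed["reasoning_content"] == "add 2 and 2"
    assert parsed["content"] == "The answer is 4."

test_full_think_block_in_one_chunk()
```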
## Before/After Comparison

- Scenario 1: Complete reasoning block (primary fix)
- Scenario 2: Content before the tag
- Scenario 3: Content after the tag
## Impact

Streaming output from Ollama models that emit `<think>` tags is now classified correctly.

## Files Changed
- `litellm/llms/ollama/chat/transformation.py` - fixed `chunk_parser` to detect complete blocks
- `tests/llm_responses_api_testing/test_ollama_chunk_parser.py` - new test suite (7 tests, 144 lines)

## Testing
All 7 tests pass:

```shell
$ pytest tests/llm_responses_api_testing/test_ollama_chunk_parser.py -v
============== 7 passed in 8.27s ==============
```

Linter checks pass:

```shell
$ ruff check tests/llm_responses_api_testing/test_ollama_chunk_parser.py
# No errors
```