fix: ensure role='assistant' in Azure streaming with include_usage#24326

Open
majiayu000 wants to merge 4 commits into BerriAI:main from majiayu000:fix/issue-24221-azure-streaming-role-missing

Conversation

@majiayu000
Contributor

Fixes #24221

Relevant issues

Fixes #24221 — LiteLLM proxy doesn't include role for /chat/completions stream=true with Azure OpenAI and stream_options.include_usage=true

What this PR does

Root cause: In streaming_handler.py chunk_creator(), when original_chunk has no choices (Azure's prompt_filter_results chunk) and include_usage=True, the code returns model_response without calling strip_role_from_delta(). This means:

  1. The empty-choices chunk has no role in its delta
  2. __next__/__anext__ then sets sent_first_chunk=True
  3. When the actual first content chunk arrives with role='assistant' from Azure, strip_role_from_delta() sees sent_first_chunk=True and strips the role

Net result: no chunk ever has role='assistant'.

Fix: Call self.strip_role_from_delta(model_response) before returning model_response at line 1559. This is consistent with the other return paths in chunk_creator (lines 895 and 989) that already call strip_role_from_delta.
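The effect of the fix can be sketched as follows. This is an illustrative simplification, not the actual LiteLLM code (the dataclasses and StreamWrapper here are stand-ins): it shows how routing the empty-choices chunk through strip_role_from_delta() makes that chunk the carrier of role='assistant', so the real first content chunk correctly has its role stripped.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Delta:
    role: Optional[str] = None
    content: Optional[str] = None


@dataclass
class Choice:
    delta: Delta = field(default_factory=Delta)


@dataclass
class ModelResponse:
    choices: List[Choice] = field(default_factory=list)


class StreamWrapper:
    """Stand-in for CustomStreamWrapper's role-handling behavior."""

    def __init__(self) -> None:
        self.sent_first_chunk = False

    def strip_role_from_delta(self, model_response: ModelResponse) -> ModelResponse:
        # First emitted chunk keeps/gains role='assistant'; later chunks lose it.
        delta = model_response.choices[0].delta
        if not self.sent_first_chunk:
            delta.role = "assistant"
            self.sent_first_chunk = True
        else:
            delta.role = None
        return model_response


wrapper = StreamWrapper()

# Chunk 1: Azure prompt_filter_results chunk (empty choices, padded to one
# StreamingChoices entry). With the fix it now passes through
# strip_role_from_delta() and carries the role.
first = wrapper.strip_role_from_delta(ModelResponse(choices=[Choice()]))

# Chunk 2: Azure's real first content chunk arrives with role='assistant',
# but sent_first_chunk is already True, so the role is stripped.
second = wrapper.strip_role_from_delta(
    ModelResponse(choices=[Choice(Delta(role="assistant", content="Hello"))])
)

print(first.choices[0].delta.role)   # assistant
print(second.choices[0].delta.role)  # None
```

Before the fix, the first chunk bypassed strip_role_from_delta() entirely, so no emitted chunk ever carried the role.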

Pre-Submission checklist

  • I have added testing in the tests/test_litellm/ directory (adding at least 1 test is a hard requirement; see details)
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

  • litellm/litellm_core_utils/streaming_handler.py: Changed return model_response to return self.strip_role_from_delta(model_response) in the include_usage empty-choices path
  • tests/test_litellm/litellm_core_utils/test_streaming_handler.py: Added test_azure_streaming_role_with_include_usage covering both sync and async iteration with mock Azure chunks
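The Azure chunk sequence such a regression test replays can be sketched as plain dicts. Field names follow the OpenAI streaming schema; the actual test file may construct these chunks differently, and the token counts below are made up for illustration.

```python
# Mock Azure streaming sequence with stream_options.include_usage=True:
# prompt_filter_results -> first content -> content -> finish -> usage.
azure_chunks = [
    # 1. prompt_filter_results chunk: empty choices, no usage
    {"id": "cmpl-1", "choices": [],
     "prompt_filter_results": [{"prompt_index": 0}]},
    # 2. first content chunk: carries role='assistant' from Azure
    {"id": "cmpl-1",
     "choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}}]},
    # 3. actual content
    {"id": "cmpl-1",
     "choices": [{"index": 0, "delta": {"content": "Hello"}}]},
    # 4. finish chunk
    {"id": "cmpl-1",
     "choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]},
    # 5. final usage-only chunk (only emitted when include_usage=True)
    {"id": "cmpl-1", "choices": [],
     "usage": {"prompt_tokens": 5, "completion_tokens": 1, "total_tokens": 6}},
]

print(len(azure_chunks))  # 5
```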

@vercel

vercel bot commented Mar 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
litellm | Ready | Preview, Comment | Mar 22, 2026 5:21am


@codspeed-hq
Contributor

codspeed-hq bot commented Mar 21, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing majiayu000:fix/issue-24221-azure-streaming-role-missing (1162925) with main (f5194b5)

Open in CodSpeed

When Azure sends an initial chunk with no choices (prompt_filter_results)
and stream_options.include_usage=True, the chunk was returned without
calling strip_role_from_delta(). This caused sent_first_chunk to be set
True prematurely, so the actual first content chunk had its role stripped.

Call strip_role_from_delta() on the empty-choices return path, consistent
with the other return paths in chunk_creator().

Fixes BerriAI#24221

Signed-off-by: majiayu000 <1835304752@qq.com>
Reformat with Black 23.x to match CI's version requirement.
The previous formatting commit used a newer Black version.

Signed-off-by: majiayu000 <1835304752@qq.com>
@majiayu000 force-pushed the fix/issue-24221-azure-streaming-role-missing branch from a8582fd to 1162925 on March 22, 2026 05:20
@majiayu000 marked this pull request as ready for review on March 22, 2026 05:45
@greptile-apps
Contributor

greptile-apps bot commented Mar 22, 2026

Greptile Summary

This PR fixes a missing role='assistant' in Azure OpenAI streaming responses when stream_options.include_usage=True by calling self.strip_role_from_delta(model_response) on the early-return path for empty-choices chunks, making it consistent with all other return paths in chunk_creator.

Key changes:

  • streaming_handler.py: One-line fix, changing return model_response to return self.strip_role_from_delta(model_response) in the include_usage empty-choices branch (line 1559). This is consistent with the two other strip_role_from_delta call sites at lines 895 and 989.
  • test_streaming_handler.py: Adds test_azure_streaming_role_with_include_usage (sync + async parameterized) reproducing the exact Azure chunk sequence (prompt_filter_results → first content → finish → usage). All other test-file changes are cosmetic reformatting (Black line-length).
  • audit_logs.py: Purely cosmetic reformatting; zero logic changes.

Correctness notes: model_response_creator() always ensures choices has at least one StreamingChoices entry, so calling strip_role_from_delta on the previously empty-choices path is safe — no IndexError risk. The fix makes the prompt_filter_results chunk the carrier of role='assistant', while the subsequent first real content chunk correctly has its role stripped, which is OpenAI-spec-compliant behavior. The fix has no impact on the final usage-only chunk (comes after content, so sent_first_chunk=True at that point — strip_role_from_delta safely no-ops the pop).
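The "safe no-op pop" on the final usage-only chunk can be illustrated with a minimal sketch (assumed behavior, not the real implementation, using a dict-style delta for brevity):

```python
from typing import Dict, Optional


def strip_role(delta: Dict[str, Optional[str]], sent_first_chunk: bool) -> Dict[str, Optional[str]]:
    """Sketch: once the first chunk has been sent, drop any 'role' key."""
    if sent_first_chunk:
        # pop with a default is a no-op when 'role' is absent,
        # which is exactly the usage-only chunk's case (empty delta)
        delta.pop("role", None)
    return delta


usage_delta: Dict[str, Optional[str]] = {}  # final usage chunk: empty delta
result = strip_role(usage_delta, sent_first_chunk=True)
print(result)  # {}
```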

Confidence Score: 4/5

  • Safe to merge — minimal, well-targeted one-line fix with a new mock-only regression test; no network calls, no backwards-incompatible changes.
  • The core fix is a single-line change consistent with existing patterns in the same function. model_response_creator guarantees choices[0] is always present, so no IndexError risk. The only minor concern is that the new test assertion is weaker than ideal (checks "any chunk has role" rather than "first chunk has role, others don't"), but the fix itself is correct and the test does catch the original regression.
  • No files require special attention beyond the minor test-assertion suggestion on test_streaming_handler.py.

Important Files Changed

Filename | Overview
litellm/litellm_core_utils/streaming_handler.py | One-line fix calling strip_role_from_delta in the include_usage empty-choices path, consistent with other return paths; correctly resolves the Azure prompt_filter_results chunk causing role='assistant' to be skipped.
tests/test_litellm/litellm_core_utils/test_streaming_handler.py | Adds test_azure_streaming_role_with_include_usage covering both sync and async paths using pre-canned mock chunks; other changes are purely cosmetic reformatting; test assertion is slightly weaker than ideal (checks any chunk has role, not first chunk).
litellm/proxy/management_helpers/audit_logs.py | Purely cosmetic reformatting for line length (Black-style); no logic changes.

Sequence Diagram

sequenceDiagram
    participant Azure as Azure OpenAI
    participant CSW as CustomStreamWrapper
    participant Client as API Consumer

    Note over Azure,Client: stream_options.include_usage=True

    Azure->>CSW: Chunk 1: prompt_filter_results (choices=[])
    Note over CSW: chunk_creator: else branch (no choices)<br/>include_usage=True
    Note over CSW: ✅ FIXED: calls strip_role_from_delta()<br/>sent_first_chunk=False → sets role='assistant', sent_first_chunk=True
    CSW->>Client: {choices:[{delta:{role:'assistant'}}]}

    Azure->>CSW: Chunk 2: first content (role='assistant', content='')
    Note over CSW: strip_role_from_delta()<br/>sent_first_chunk=True → strips role
    CSW->>Client: {choices:[{delta:{content:''}}]}

    Azure->>CSW: Chunk 3: content ('Hello')
    CSW->>Client: {choices:[{delta:{content:'Hello'}}]}

    Azure->>CSW: Chunk 4: finish_reason='stop'
    CSW->>Client: {choices:[{finish_reason:'stop', delta:{}}]}

    Azure->>CSW: Chunk 5: usage chunk (choices=[])
    Note over CSW: include_usage=True, strip_role_from_delta()<br/>sent_first_chunk=True → safe no-op pop
    CSW->>Client: {choices:[…], usage:{…}}

Last reviewed commit: "style: fix Black for..."

Comment on lines +1980 to +1992
# At least one chunk must have role='assistant' in its delta
has_role = any(
    hasattr(c, "choices")
    and len(c.choices) > 0
    and hasattr(c.choices[0], "delta")
    and getattr(c.choices[0].delta, "role", None) == "assistant"
    for c in chunks
)
assert (
    has_role
), "No chunk contained role='assistant' in delta. " "Chunk deltas: " + str(
    [c.choices[0].delta if c.choices else "no choices" for c in chunks]
)

P2 Consider a stronger assertion for role placement

The current assertion only verifies that at least one chunk in the entire stream carries role='assistant'. After the fix, the role is attached to the empty prompt_filter_results chunk (the first chunk emitted), while the actual first content chunk has its role stripped by strip_role_from_delta. A stricter test would also confirm that the role appears on the correct chunk (the first yielded chunk) and is absent from later content chunks, preventing a future regression where both chunks could accidentally carry the role:

# Verify role appears in exactly the first emitted chunk
assert len(chunks) > 0, "No chunks were yielded"
first_chunk = chunks[0]
assert (
    len(first_chunk.choices) > 0
    and getattr(first_chunk.choices[0].delta, "role", None) == "assistant"
), f"Expected role='assistant' in the first chunk, got: {first_chunk.choices[0].delta if first_chunk.choices else 'no choices'}"

# Verify subsequent chunks do NOT repeat the role
for chunk in chunks[1:]:
    if chunk.choices:
        assert getattr(chunk.choices[0].delta, "role", None) != "assistant", \
            f"Unexpected role='assistant' in non-first chunk: {chunk.choices[0].delta}"



Development

Successfully merging this pull request may close these issues.

[Bug]: LiteLLM proxy doesn't include "role" for /chat/completions stream=true with Azure OpenAI and stream_options.include_usage=true