- Review the field-by-field mapping rules below when converting
litellm.ModelResponseStreampayloads intolitellm.GenericStreamingChunk. - Each rule cites at least one concrete example chunk from the attached traces so you can quickly reopen the original stream capture if you need to double-check the raw data.
- Preserve nulls/omitted keys as-is unless a rule explicitly calls for a default.
id→ copy verbatim toGenericStreamingChunk.id. Reference:Response Chunk #0inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md.created→ copy toGenericStreamingChunk.createdwithout transformation. Reference:Response Chunk #0inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md.model→ populateGenericStreamingChunk.modelwith the same string. Reference:Response Chunk #0inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md.object→ pass through toGenericStreamingChunk.object. (The examples use"chat.completion.chunk"; keep whatever value arrives.) Reference:Response Chunk #0inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md.system_fingerprint→ copy directly toGenericStreamingChunk.system_fingerprint, preservingnull. Reference:Response Chunk #0inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md.provider_specific_fields(top-level) → forward untouched into the correspondingGenericStreamingChunkfield. Reference:Response Chunk #0inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md.citations→ expose onGenericStreamingChunk.citations; keep nulls if present. Reference:Response Chunk #0inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md.usage→ when theModelResponseStreamchunk includes ausageblock, attach it toGenericStreamingChunk.usagewithout altering the numeric counters or nested detail dictionaries. Reference:Response Chunk #106inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md.
- Always emit a
GenericStreamingChunk.choiceslist whose length matches the incomingchoicesarray. Preserve the order so indexes remain aligned with the upstream stream. Reference:Response Chunk #0inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md. - For each element, set
GenericStreamingChoice.indexequal to the incomingindex. Reference:Response Chunk #0inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md. - Forward the
finish_reason(includingnull) toGenericStreamingChoice.finish_reason. Reference:Response Chunk #105inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md. - Accept non-
stopfinish signals (e.g.,"tool_calls") and propagate them unchanged so downstream logic can detect tool switchovers. Reference:Response Chunk #63inChatCompletions_API_streaming_examples/20251108_222758_22270_RESPONSE_STREAM.md. - Map any
logprobsfield—currentlynullin the traces—toGenericStreamingChoice.logprobsverbatim. Reference:Response Chunk #0inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md.
- Copy the entire
deltaobject into a freshGenericStreamingDeltastructure, mirroring the keys present in the stream. delta.content→ assign toGenericStreamingDelta.content, concatenating downstream as needed. Reference:Response Chunk #0inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md.- When a chunk only carries tool-call metadata, providers often emit
""fordelta.content; keep the empty string instead of normalizing it away so chunk ordering stays aligned. Reference:Response Chunk #64inChatCompletions_API_streaming_examples/20251108_222808_70283_RESPONSE_STREAM.md. delta.role→ populateGenericStreamingDelta.role, noting that later chunks often sendnull. Reference:Response Chunk #0inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md.- Subsequent deltas regularly omit the role (
null); mirror the streamed value inside each chunk instead of injecting the previously observed role. Reference:Response Chunk #0vsResponse Chunk #1inChatCompletions_API_streaming_examples/20251109_125816_01437_RESPONSE_STREAM.md. delta.provider_specific_fields→ carry forward unchanged onto the delta. Reference:Response Chunk #0inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md.delta.function_call→ forward as-is (the current capture showsnull, but preserve the object structure if present). Reference:Response Chunk #0inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md.delta.tool_calls→ preserve the list (even whennull) for later combination with tool streaming logic. Reference:Response Chunk #0inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md.delta.audio→ forward the value (currentlynull) to the delta’s audio slot so audio-capable providers remain compatible. Reference:Response Chunk #0inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md.
- When
delta.tool_callsis a list of call deltas, map each entry to aGenericStreamingToolCallDeltawhile preserving the incoming ordering. tool_call.id→ copy the identifier (which may benullin a given chunk). Reference:Response Chunk #8inChatCompletions_API_streaming_examples/20251108_222824_10592_RESPONSE_STREAM.md.tool_call.type→ transfer directly (the capture shows"function"; preserve any other provider values). Reference:Response Chunk #8inChatCompletions_API_streaming_examples/20251108_222824_10592_RESPONSE_STREAM.md.tool_call.index→ mirror the numeric slot so downstream tooling can correlate deltas. Reference:Response Chunk #8inChatCompletions_API_streaming_examples/20251108_222824_10592_RESPONSE_STREAM.md.tool_call.function.name→ forward the value (includingnullwhen the provider omits it in a fragment). Reference:Response Chunk #8inChatCompletions_API_streaming_examples/20251108_222824_10592_RESPONSE_STREAM.md.tool_call.function.arguments→ forward the streamed arguments substring exactly as received. Reference:Response Chunk #9inChatCompletions_API_streaming_examples/20251108_222824_10592_RESPONSE_STREAM.md.
usageonly appears on the closing chunks; keepGenericStreamingChunk.usageunset for intermediate emissions and populate it once the payload arrives. Reference:Response Chunk #28inChatCompletions_API_streaming_examples/20251109_125816_01437_RESPONSE_STREAM.md.- Copy the numeric counters (
prompt_tokens,completion_tokens,total_tokens) directly; they already reflect request-level totals. Reference:Response Chunk #35inChatCompletions_API_streaming_examples/20251109_125816_01973_RESPONSE_STREAM.md. - Preserve every nested
*_tokens_detailsblock and cache counter exactly as provided (including zeros andnullvalues) so downstream consumers retain provider-specific accounting. Reference: theusageblock inChatCompletions_API_streaming_examples/20251108_222915_51732_RESPONSE_STREAM.md. - Cached-token metrics can shift between the
cache_creation_*andcache_read_*counters across calls; never normalize these values. Reference:Response Chunk #41inChatCompletions_API_streaming_examples/20251109_131644_45210_RESPONSE_STREAM.md(cache_creation_tokenspopulated) versusResponse Chunk #19inChatCompletions_API_streaming_examples/20251109_131704_44443_RESPONSE_STREAM.md(cache_read_input_tokenspopulated).