Bug on databricks_langchain/openai streaming when used with thinking. #133

@shahrukh-shaik

The bug appears when streaming messages with the llm.stream method, either in langgraph or in regular streaming, wherever the message chunks have to be merged.
Here is what each message chunk looks like during predict_stream:

for i in chat_model.stream("hi"):
    print(i)

Response

content=[{'type': 'reasoning', 'summary': [{'type': 'summary_text', 'text': 'The', 'signature': ''}]}] additional_kwargs={} response_metadata={} id='run--b1a1d8f7-57af-4d17-bf8f-3dad69e98fe1'
content=[{'type': 'reasoning', 'summary': [{'type': 'summary_text', 'text': ' message from the user is a simple', 'signature': ''}]}] additional_kwargs={} response_metadata={} id='run--b1a1d8f7-57af-4d17-bf8f-3dad69e98fe1'
content=[{'type': 'reasoning', 'summary': [{'type': 'summary_text', 'text': ' greeting - "hi". I shoul', 'signature': ''}]}] additional_kwargs={} response_metadata={} id='run--b1a1d8f7-57af-4d17-bf8f-3dad69e98fe1'
content=[{'type': 'reasoning', 'summary': [{'type': 'summary_text', 'text': 'd respond in a friendly and welcoming manner', 'signature': ''}]}] additional_kwargs={} response_metadata={} id='run--b1a1d8f7-57af-4d17-bf8f-3dad69e98fe1'
content=[{'type': 'reasoning', 'summary': [{'type': 'summary_text', 'text': ', introducing myself as Claude and offering to help the', 'signature': ''}]}] additional_kwargs={} response_metadata={} id='run--b1a1d8f7-57af-4d17-bf8f-3dad69e98fe1'
content=[{'type': 'reasoning', 'summary': [{'type': 'summary_text', 'text': ' user with anything they might need.', 'signature': ''}]}] 

The langchain _stream function merges the ChatGenerationChunk objects using

from langchain_core.outputs.chat_generation import merge_chat_generation_chunks

With this merge, the final message comes out in a mangled format like the one below:

ChatGenerationChunk(message=AIMessageChunk(content=[{'type': 'reasoning', 'summary': [{'type': 'summary_text', 'text': 'd respond in a friendly and welcoming manner', 'signature': ''}]}, {'type': 'reasoning', 'summary': [{'type': 'summary_text', 'text': 'd respond in a friendly and welcoming manner', 'signature': ''}]}, {'type': 'reasoning', 'summary': [{'type': 'summary_text', 'text': 'd respond in a friendly and welcoming manner', 'signature': ''}]}], additional_kwargs={}, response_metadata={}, id='run--4511c99b-da39-46cc-b701-4cc161e52d9b'))

Upon investigation, I found that the merge works as expected if each message stream chunk is in one of the formats below.

content=[{'type': 'reasoning', 'summary': {'type': 'summary_text', 'text': 'd respond in a friendly and welcoming manner', 'signature': ''}, 'index': 0}]
or
content=[{'type': 'text', 'text': "hi", 'index': 1}]

Please change the streamed chunks to include 'index': 0 so that all chunks belonging to the same content block are merged, and 'index': 1 for the next content block. Let me know if you need more info on this issue.

A sample that merges correctly:

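from langchain_core.messages import AIMessageChunk
from langchain_core.outputs import ChatGenerationChunk
from langchain_core.outputs.chat_generation import merge_chat_generation_chunks

# Every reasoning chunk carries 'index': 0; the text chunk carries 'index': 1.
merge_chat_generation_chunks([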
    ChatGenerationChunk(message=AIMessageChunk(content=[{'type': 'reasoning', 'summary': {'type': 'summary_text', 'text': 'd respond in a friendly and welcoming manner', 'signature': ''}, 'index': 0}], additional_kwargs={}, response_metadata={}, id='run--4511c99b-da39-46cc-b701-4cc161e52d9b')), 
    ChatGenerationChunk(message=AIMessageChunk(content=[{'type': 'reasoning', 'summary': {'type': 'summary_text', 'text': 'd respond in a friendly and welcoming manner', 'signature': ''}, 'index': 0}], additional_kwargs={}, response_metadata={}, id='run--4511c99b-da39-46cc-b701-4cc161e52d9b')), 
    ChatGenerationChunk(message=AIMessageChunk(content=[{'type': 'reasoning', 'summary': {'type': 'summary_text', 'text': 'd respond in a friendly and welcoming manner', 'signature': ''}, 'index': 0}], additional_kwargs={}, response_metadata={}, id='run--4511c99b-da39-46cc-b701-4cc161e52d9b')),
    ChatGenerationChunk(message=AIMessageChunk(content=[{'type': 'text', 'text': "hi", 'index': 1}], additional_kwargs={}, response_metadata={}, id='run--4511c99b-da39-46cc-b701-4cc161e52d9b'))])
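
To illustrate why the 'index' key matters, here is a simplified pure-Python model of index-based merging. It is only a sketch and not the langchain_core implementation: merge_block and merge_content are hypothetical helpers written for this example, and the rule of concatenating only the 'text' field is an assumption made to keep the example short. Blocks without an integer 'index' are appended as separate entries, while blocks that share an 'index' are folded into one.

```python
from typing import Any


def merge_block(left: dict[str, Any], right: dict[str, Any]) -> dict[str, Any]:
    """Merge two content blocks: concatenate 'text', recurse into dicts, keep the rest."""
    merged = dict(left)
    for key, value in right.items():
        if key not in merged:
            merged[key] = value
        elif key == "text" and isinstance(value, str):
            merged[key] = merged[key] + value
        elif isinstance(value, dict) and isinstance(merged[key], dict):
            merged[key] = merge_block(merged[key], value)
        # Any other key (e.g. 'type', 'index', 'signature') keeps the left value.
    return merged


def merge_content(chunks: list[list[dict[str, Any]]]) -> list[dict[str, Any]]:
    """Fold streamed content lists into one list, merging blocks that share an 'index'."""
    merged: list[dict[str, Any]] = []
    for content in chunks:
        for block in content:
            index = block.get("index")
            position = next(
                (i for i, existing in enumerate(merged)
                 if isinstance(index, int) and existing.get("index") == index),
                None,
            )
            if position is None:
                # No index, or the first block seen with this index: append it.
                merged.append(dict(block))
            else:
                merged[position] = merge_block(merged[position], block)
    return merged


# Two streamed reasoning chunks without an index (hypothetical sample data).
without_index = [
    [{"type": "reasoning", "summary": {"type": "summary_text", "text": "The"}}],
    [{"type": "reasoning", "summary": {"type": "summary_text", "text": " message"}}],
]
# The same chunks tagged with 'index': 0, followed by a text block with 'index': 1.
with_index = [[{**block, "index": 0} for block in content] for content in without_index]
with_index.append([{"type": "text", "text": "hi", "index": 1}])

unmerged = merge_content(without_index)  # one entry per chunk, nothing is merged
merged = merge_content(with_index)       # one reasoning block and one text block
print(len(unmerged), len(merged))
print(merged[0]["summary"]["text"])
```

In this model the chunks without an index stay as two separate reasoning entries, while the indexed chunks collapse into a single reasoning block followed by the text block, which is the behaviour requested above.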
