openai[patch]: fix getting generation info when streaming with json_schema response format on Azure #31062

jjurm · 2025-04-29T00:43:59Z

Problem

When using Azure's chat completions API, with json_schema response format and streaming, the LLMResult response passed to callbacks doesn't contain model_name, just an empty string. (The support for streaming requests with json_schema response format was added in #29044.)

The problem can be reproduced with the following minimal example (with env variables as described here).

from langchain_core.callbacks.base import BaseCallbackHandler
from langchain_openai import AzureChatOpenAI
from pydantic import BaseModel

class CustomCallbackHandler(BaseCallbackHandler):
    def on_llm_end(self, response, **kwargs):
        model_name = response.generations[0][0].generation_info["model_name"]
        print(f"{model_name=}")

llm = AzureChatOpenAI(
    azure_deployment="gpt-4o-2024-08-06",
    callbacks=[CustomCallbackHandler()],
)

class Answer(BaseModel):
    joke: str

response = llm.stream("Tell me a joke", response_format=Answer)
list(response)

Observed output:

model_name=''

Expected output:

model_name='gpt-4o-2024-08-06'

The output is as expected when changing AzureChatOpenAI to ChatOpenAI (and the corresponding OpenAI endpoint), and also when function_calling method for structured outputs is used instead of the default json_schema.

Package versions used:

langchain=0.3.24
langchain-openai=0.3.14
langchain-community=0.3.23

Solution

This PR fixes the problem.

Similar approach is already used a couple lines above.

vercel · 2025-04-29T00:44:03Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment

Name	Status	Preview	Comments	Updated (UTC)
langchain	⬜️ Ignored (Inspect)	Visit Preview		May 1, 2025 7:31am

jjurm · 2025-04-29T14:51:43Z

cc @ccurme this PR might be most relevant to you since you added the related #29044

ccurme

Hi @jjurm, I do see model_name in the response_metadata when I run your example. On your branch I actually see it doubled-up:

model_name='gpt-4o-mini-2024-07-18gpt-4o-mini-2024-07-18'

so we cannot merge this as-is.

When passing response_format and streaming you should be entering this condition, which is where model_name is assigned. Can you use that to debug?

langchain/libs/partners/openai/langchain_openai/chat_models/base.py

Lines 902 to 911 in 6268ae8

    
           if hasattr(response, "get_final_completion") and "response_format" in payload: 
        
               final_completion = response.get_final_completion() 
        
               generation_chunk = self._get_generation_chunk_from_completion( 
        
                   final_completion 
        
               ) 
        
               if run_manager: 
        
                   run_manager.on_llm_new_token( 
        
                       generation_chunk.text, chunk=generation_chunk 
        
                   ) 
        
               yield generation_chunk

jjurm · 2025-05-05T09:08:30Z

@ccurme thanks a lot for the pointers!
tldr: solved by openai/openai-python#2255.

In my case, when the if condition entered, response.current_completion_snapshot.model and response.get_final_completion().model were both '', hinting it's an openai issue. The only place .model gets value is in _convert_initial_chunk_into_snapshot; in my case, the initial chunk is empty, as described in openai/openai-python#2255.

Updating to openai>=1.74.0 solved the issue.

dosubot bot added size:XS bug Related to a bug, vulnerability, unexpected error with an existing feature labels Apr 29, 2025

jjurm force-pushed the jm/oai_structured_stream_model_name branch from 37bc156 to d05d09f Compare April 29, 2025 14:44

fix getting model_name when streaming with json_schema response format

d069e0f

jjurm force-pushed the jm/oai_structured_stream_model_name branch from d05d09f to d069e0f Compare May 1, 2025 07:30

ccurme reviewed May 3, 2025

View reviewed changes

ccurme self-assigned this May 3, 2025

jjurm closed this May 5, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

openai[patch]: fix getting generation info when streaming with json_schema response format on Azure #31062

openai[patch]: fix getting generation info when streaming with json_schema response format on Azure #31062

Uh oh!

jjurm commented Apr 29, 2025

Uh oh!

vercel bot commented Apr 29, 2025 •

edited

Loading

Uh oh!

jjurm commented Apr 29, 2025

Uh oh!

ccurme left a comment

Uh oh!

jjurm commented May 5, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	if hasattr(response, "get_final_completion") and "response_format" in payload:
	final_completion = response.get_final_completion()
	generation_chunk = self._get_generation_chunk_from_completion(
	final_completion
	)
	if run_manager:
	run_manager.on_llm_new_token(
	generation_chunk.text, chunk=generation_chunk
	)
	yield generation_chunk

openai[patch]: fix getting generation info when streaming with json_schema response format on Azure #31062

openai[patch]: fix getting generation info when streaming with json_schema response format on Azure #31062

Uh oh!

Conversation

jjurm commented Apr 29, 2025

Problem

Solution

Uh oh!

vercel bot commented Apr 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jjurm commented Apr 29, 2025

Uh oh!

ccurme left a comment

Choose a reason for hiding this comment

Uh oh!

jjurm commented May 5, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

vercel bot commented Apr 29, 2025 •

edited

Loading