
[langchain-openai] with_structured_output() silently drops previously bound tools and lacks support for OpenAI native tool bindings #35320

@ardakdemir

Description

Checked other resources

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Package (Required)

  • langchain-openai

Related Issues / PRs

Related issue: #28848

This is related to #28848 (bind_tools not callable after with_structured_output), which describes the reverse ordering problem. This issue covers a different but connected scenario: when .bind(tools=...) is called before .with_structured_output(), the tools are silently dropped from the API request with no error or warning. Together, these issues show that with_structured_output() and tool bindings are incompatible in both directions, and neither direction gives the user a clear signal.

Reproduction Steps / Example Code (Python)

from langchain_openai import ChatOpenAI
from langchain_core.utils.function_calling import convert_to_openai_function
from pydantic import BaseModel, Field

class WeatherResponse(BaseModel):
    temperature: float = Field(description="The temperature in fahrenheit")
    condition: str = Field(description="The weather condition")

llm = ChatOpenAI(model="gpt-5-mini")

# ----------------------------------------------------------------
# BUG: .with_structured_output() silently drops tools from .bind()
# ----------------------------------------------------------------

# Step 1 β€” Bind an OpenAI native tool (e.g. web_search)
llm_with_tools = llm.bind(tools=[{"type": "web_search"}])

# Step 2 β€” Apply structured output on top
chain = llm_with_tools.with_structured_output(WeatherResponse)

# Step 3 β€” Invoke
result = chain.invoke("What is the weather in San Francisco right now?")
print(result)
# => temperature=57.0
#    condition='Estimated typical current weather: cool and often foggy.
#    I can't access live weather data ...'
#
# The model HALLUCINATED the weather because web_search was silently dropped.
# No error, no warning: it just disappears.

# ----------------------------------------------------------------
# WORKAROUND: bypass with_structured_output(), use a single .bind()
# ----------------------------------------------------------------

# Users must manually replicate LangChain's internal schema conversion
# to build the response_format dict; no Pydantic convenience here.
function = convert_to_openai_function(WeatherResponse, strict=True)
function["schema"] = function.pop("parameters")
response_format = {"type": "json_schema", "json_schema": function}

llm_fixed = llm.bind(
    tools=[{"type": "web_search"}],
    tool_choice="auto",
    response_format=response_format,
)

result = llm_fixed.invoke("What is the weather in San Francisco right now?")
print(result)
# => content=[..., {'type': 'web_search_call', 'status': 'completed', ...},
#    ..., {'type': 'text', 'text': '{"temperature":52,"condition":"Mostly cloudy ..."}'}]
#
# The model correctly performed a web search and returned real data.
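Note that the workaround also loses the automatic parsing step: with_structured_output() would return a WeatherResponse instance, while the single-.bind() path returns raw content blocks that must be parsed by hand. A minimal sketch of that manual parsing, written against the content shape shown above (the helper function is illustrative, not LangChain API):

```python
import json

from pydantic import BaseModel, Field


class WeatherResponse(BaseModel):
    temperature: float = Field(description="The temperature in fahrenheit")
    condition: str = Field(description="The weather condition")


def parse_structured_blocks(blocks: list[dict]) -> WeatherResponse:
    # The structured JSON arrives as the final "text" content block,
    # after any tool-call blocks such as "web_search_call"
    text = next(b["text"] for b in reversed(blocks) if b.get("type") == "text")
    return WeatherResponse.model_validate(json.loads(text))
```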

Error Message and Stack Trace (if applicable)

**There is no error.** That is the core problem: this fails **completely silently**.

No exception is raised, no warning is logged, no deprecation notice is emitted. The `tools` parameter is simply absent from the HTTP request payload sent to OpenAI. This was confirmed by inspecting raw HTTP requests via `httpx` event hooks.

The model returns a plausible-looking response (hallucinated data), making it very difficult to notice that tools were dropped.

Description

Part 1 (bug): silent dropping of all tool bindings

Calling .bind(tools=[...]) followed by .with_structured_output(schema) on a ChatOpenAI instance causes the tool bindings to be silently discarded.

Root cause: with_structured_output() internally creates a new RunnableSequence that configures its own model bindings to enforce the output schema (via response_format or tool_choice). These new bindings do not merge with the previously bound kwargs from .bind(). The tools kwarg is effectively overwritten.

This silent drop affects all tool types: both OpenAI native tools ({"type": "web_search"}) and custom LangChain tools (callables passed via bind_tools()). Any kwargs set via .bind() before calling .with_structured_output() are lost.

I verified this by injecting httpx event hooks to log the raw HTTP request body. The logs confirmed:

| Scenario | `tools` in payload | `response_format` in payload |
| --- | --- | --- |
| `.bind(tools=...)` then `.with_structured_output(...)` | Missing | Present |
| `.bind(tools=..., response_format=...)` (workaround) | Present | Present |

Part 2 (feature gap): no Pydantic-friendly path for native tools + structured output

The OpenAI API fully supports sending both tools and response_format in the same request; these are independent, composable features. This is especially important for OpenAI's native tools (web_search, code_interpreter, file_search), which are designed to work alongside structured output: the model uses the tool internally and returns results in the structured format.

However, there is currently no way to combine a Pydantic class (via with_structured_output()) with native tools. The only workaround is to:

  1. Manually convert the Pydantic class to an OpenAI-compatible JSON schema dict (replicating LangChain's internal _convert_to_openai_response_format logic)
  2. Pass both tools and response_format in a single .bind() call

This defeats the purpose of the with_structured_output() abstraction and forces users to work at a lower level of the API.

Why This Is a Problem

  1. Silent failures are dangerous. The code appears to work (it returns structured JSON) but the model is hallucinating instead of using the tool. In a production system this produces incorrect results with no signal that something is wrong. Libraries should fail fast and loud, not silently swallow configuration.

  2. No Pydantic path for a common use case. Combining native tools with structured output is a mainstream OpenAI pattern (e.g., "search the web and return results as this Pydantic schema"). Today, users cannot use with_structured_output(MyPydanticModel) for this; they must manually build the response_format dict, which is error-prone and bypasses LangChain's schema validation and strict-mode handling.

  3. The workaround is non-trivial. Users must replicate LangChain's internal _convert_to_openai_response_format logic (using convert_to_openai_function, renaming parameters to schema, wrapping in the json_schema envelope). This is fragile and may break across LangChain versions.

Suggested Fix

Option A: preserve existing bindings (minimal fix for the bug):

with_structured_output() should read the current self.kwargs before creating new bindings and merge them, so previously bound tools are not lost:

# Inside with_structured_output(), before creating the new chain:
existing_kwargs = getattr(self, 'kwargs', {})
# Merge tools, tool_choice, etc. from existing_kwargs into the new binding
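A minimal sketch of what that merge could look like, as a standalone helper (the function name and the set of preserved keys are assumptions, not existing LangChain internals):

```python
def merge_bound_kwargs(existing: dict, new: dict) -> dict:
    """Merge previously bound kwargs into a structured-output binding.

    Keys set by the structured-output path (e.g. response_format) take
    precedence; tool-related keys from an earlier .bind() are preserved.
    """
    merged = dict(new)
    for key in ("tools", "tool_choice"):
        if key in existing and key not in merged:
            merged[key] = existing[key]
    return merged


# Earlier .bind(tools=..., tool_choice=...) survives alongside the
# response_format that with_structured_output() wants to add:
merged = merge_bound_kwargs(
    {"tools": [{"type": "web_search"}], "tool_choice": "auto"},
    {"response_format": {"type": "json_schema"}},
)
```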

Option B: accept a tools parameter directly (fixes both the bug and the feature gap):

Allow with_structured_output() to accept tools alongside the Pydantic schema, so users can combine them in a single call:

# Allow users to specify tools and structured output together:
llm.with_structured_output(
    WeatherResponse,
    tools=[{"type": "web_search"}],
    tool_choice="auto",
)

This would provide the Pydantic convenience that users expect, while correctly sending both tools and response_format to the OpenAI API.

Option C: at minimum, warn or raise (fixes the silent failure):

If supporting both is not feasible, detect when previously bound kwargs would be lost and raise an explicit error:

if self.kwargs.get("tools"):
    raise ValueError(
        "with_structured_output() does not preserve previously bound tools. "
        "Use llm.bind(tools=..., response_format=...) instead to combine "
        "tools with structured output."
    )
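A softer variant would warn instead of raising, so existing pipelines keep running while still surfacing the problem (sketch only; the helper name and checked keys are assumptions):

```python
import warnings


def warn_on_dropped_kwargs(bound_kwargs: dict) -> None:
    # Flag tool-related kwargs that with_structured_output() would discard
    dropped = [k for k in ("tools", "tool_choice") if k in bound_kwargs]
    if dropped:
        warnings.warn(
            f"with_structured_output() drops previously bound kwargs: {dropped}. "
            "Use llm.bind(tools=..., response_format=...) to combine tools "
            "with structured output.",
            stacklevel=2,
        )
```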

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 25.2.0: Tue Nov 18 21:09:41 PST 2025; ---
Python Version: 3.13.11 (main, Dec 5 2025, 16:06:33)---

Package Information

langchain_core: 1.2.8
langsmith: 0.4.60
langchain_openai: 1.1.7
langgraph_sdk: 0.3.3

Optional packages not installed

langserve

Other Dependencies

httpx: 0.28.1
jsonpatch: 1.33
openai: 2.16.0
opentelemetry-api: 1.39.1
opentelemetry-sdk: 1.39.1
orjson: 3.11.7
packaging: 25.0
pydantic: 2.12.5
pytest: 8.4.2
pyyaml: 6.0.3
requests: 2.32.5
requests-toolbelt: 1.0.0
rich: 14.3.2
tenacity: 9.1.2
tiktoken: 0.12.0
typing-extensions: 4.15.0
uuid-utils: 0.14.0
zstandard: 0.25.0

Labels: bug, external, openai (`langchain-openai` package issues & PRs)