Field-specific description in nested Pydantic models is lost when using with_structured_output() #32483

@leonardozilli

Checked other resources

  • This is a bug, not a usage question. For questions, please use the LangChain Forum (https://forum.langchain.com/).
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Example Code

from langchain_core.utils.function_calling import convert_to_openai_tool
from pydantic import BaseModel, Field
from typing import Optional

class FinancialEntry(BaseModel):
    """A single financial entry with an amount and currency."""
    amount: Optional[float] = Field(
        default=None, description="The financial value of the entry, in Euros."
    )
    currency: Optional[str] = Field(
        default=None,
        description="The currency of the financial entry, default is Euro (EUR).",
    )

class FinancialReport(BaseModel):
    """Financial report containing various financial entries."""
    report_id: str = Field(description="✅ The unique identifier for the report.")
    total_revenue: FinancialEntry = Field(
        description="❌ The total revenue entry. This description is replaced."
    )

tool_schema = convert_to_openai_tool(FinancialReport)

print(tool_schema)

Error Message and Stack Trace (if applicable)

n.a.

Description

When using with_structured_output() with a Pydantic model that includes nested models, the internal schema dereferencing logic uses the generic class description (from the nested class's docstring) instead of the field-specific description (from Field(description="...")).

The schema from the example above is converted to:

{'type': 'function',
 'function': {'name': 'FinancialReport',
  'description': 'Financial report containing various financial entries.',
  'parameters': {'properties': {'report_id': {'description': '✅ The unique identifier for the report.',
     'type': 'string'},
    'total_revenue': {'description': 'A single financial entry with an amount and currency.',
     'properties': {'amount': {'anyOf': [{'type': 'number'}, {'type': 'null'}],
       'default': None,
       'description': 'The financial value of the entry, in Euros.'},
      'currency': {'anyOf': [{'type': 'string'}, {'type': 'null'}],
       'default': None,
       'description': 'The currency of the financial entry, default is Euro (EUR).'}},
     'type': 'object'}},
   'required': ['report_id', 'total_revenue'],
   'type': 'object'}}}

where I would expect:

{'type': 'function',
 'function': {'name': 'FinancialReport',
  'description': 'Financial report containing various financial entries.',
  'parameters': {'properties': {'report_id': {'description': '✅ The unique identifier for the report.',
     'type': 'string'},
    'total_revenue': {'description': '❌ The total revenue entry. This description is replaced.',
     'properties': {'amount': {'anyOf': [{'type': 'number'}, {'type': 'null'}],
       'default': None,
       'description': 'The financial value of the entry, in Euros.'},
      'currency': {'anyOf': [{'type': 'string'}, {'type': 'null'}],
       'default': None,
       'description': 'The currency of the financial entry, default is Euro (EUR).'}},
     'type': 'object'}},
   'required': ['report_id', 'total_revenue'],
   'type': 'object'}}}

If the nested model has no docstring, no description is added to the schema.

This doesn't seem to happen with Pydantic v1 (i.e. with from langchain_core.pydantic_v1 import BaseModel, Field instead of from pydantic import BaseModel, Field, as suggested in #21270).
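
To make the suspected mechanism concrete, here is a minimal standard-library sketch. It is not LangChain's actual code. The hand-written SCHEMA dict assumes the shape recent Pydantic v2 versions appear to emit for a nested model, with the field description as a sibling of "$ref" (this can be checked with FinancialReport.model_json_schema()). The helpers resolve and dereference are hypothetical names used only for illustration.

```python
# Assumed input shape: the field's own description sits next to "$ref",
# while the nested class docstring lives on the definition in "$defs".
SCHEMA = {
    "type": "object",
    "properties": {
        "total_revenue": {
            "$ref": "#/$defs/FinancialEntry",
            "description": "The total revenue entry.",
        },
    },
    "$defs": {
        "FinancialEntry": {
            "type": "object",
            "description": "A single financial entry with an amount and currency.",
            "properties": {"amount": {"type": "number"}},
        },
    },
}


def resolve(root, ref):
    """Follow a local reference such as '#/$defs/FinancialEntry'."""
    node = root
    for part in ref.split("/")[1:]:  # skip the leading '#'
        node = node[part]
    return node


def dereference(node, root, keep_siblings):
    """Inline every "$ref" (hypothetical helper, not a LangChain API).

    keep_siblings=False replaces the referencing node wholesale, so keys
    next to "$ref" are dropped. keep_siblings=True lets them override the
    resolved definition.
    """
    if isinstance(node, list):
        return [dereference(item, root, keep_siblings) for item in node]
    if not isinstance(node, dict):
        return node
    if "$ref" in node:
        resolved = dereference(resolve(root, node["$ref"]), root, keep_siblings)
        if not keep_siblings:
            return resolved
        siblings = {
            key: dereference(value, root, keep_siblings)
            for key, value in node.items()
            if key != "$ref"
        }
        return {**resolved, **siblings}
    # Definitions are inlined, so the "$defs" table itself is omitted.
    return {
        key: dereference(value, root, keep_siblings)
        for key, value in node.items()
        if key != "$defs"
    }


naive = dereference(SCHEMA, SCHEMA, keep_siblings=False)
merged = dereference(SCHEMA, SCHEMA, keep_siblings=True)

print(naive["properties"]["total_revenue"]["description"])   # class docstring
print(merged["properties"]["total_revenue"]["description"])  # field description
```

With keep_siblings=False the nested property ends up with the class docstring, and with no description at all if the definition has none, which matches both symptoms reported above. With keep_siblings=True the field-specific description is kept, which is the output I would expect.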

System Info

System Information

OS: Linux
OS Version: #1 SMP PREEMPT_DYNAMIC Thu Jun 5 18:30:46 UTC 2025
Python Version: 3.12.3 (main, Jun 18 2025, 17:59:45) [GCC 13.3.0]

Package Information

langchain_core: 0.3.74
langsmith: 0.4.13
langchain_google_genai: 2.1.9

Optional packages not installed

langserve

Other Dependencies

filetype: 1.2.0
google-ai-generativelanguage: 0.6.18
httpx<1,>=0.23.0: Installed. No version info available.
jsonpatch<2.0,>=1.33: Installed. No version info available.
langsmith-pyo3>=0.1.0rc2;: Installed. No version info available.
langsmith>=0.3.45: Installed. No version info available.
openai-agents>=0.0.3;: Installed. No version info available.
opentelemetry-api>=1.30.0;: Installed. No version info available.
opentelemetry-exporter-otlp-proto-http>=1.30.0;: Installed. No version info available.
opentelemetry-sdk>=1.30.0;: Installed. No version info available.
orjson>=3.9.14;: Installed. No version info available.
packaging>=23.2: Installed. No version info available.
pydantic: 2.11.7
pydantic<3,>=1: Installed. No version info available.
pydantic>=2.7.4: Installed. No version info available.
pytest>=7.0.0;: Installed. No version info available.
PyYAML>=5.3: Installed. No version info available.
requests-toolbelt>=1.0.0: Installed. No version info available.
requests>=2.0.0: Installed. No version info available.
rich>=13.9.4;: Installed. No version info available.
tenacity!=8.4.0,<10.0.0,>=8.1.0: Installed. No version info available.
typing-extensions>=4.7: Installed. No version info available.
vcrpy>=7.0.0;: Installed. No version info available.
zstandard>=0.23.0: Installed. No version info available.
