interopZodTransformInputSchema does not retain reuse of subschema references #9307

@ro0sterjam

Checked other resources

  • This is a bug, not a usage question. For questions, please use the LangChain Forum (https://forum.langchain.com/).
  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

interopZodTransformInputSchema does not preserve the identity of reused subschemas. As a result, when zod's toJSONSchema is called on the transformed schema, the $ref definitions are created at the wrong level, and in some cases properties end up empty.

import { interopZodTransformInputSchema } from "@langchain/core/utils/types";
import { z } from "zod/v4";
import { toJSONSchema } from "zod/v4/core";

const addressSchema = z.object({
  street: z.string().describe("The street of the address."),
  city: z.string().describe("The city of the address."),
});

const testSchema = z.object({
  primary: addressSchema,
  secondary: addressSchema,
});

const withoutSanitization = toJSONSchema(testSchema, {
  cycles: "ref",
  reused: "ref",
  override(ctx) {
    ctx.jsonSchema.title = "extract";
  },
});

const withSanitization = toJSONSchema(interopZodTransformInputSchema(testSchema, true), {
  cycles: "ref",
  reused: "ref",
  override(ctx) {
    ctx.jsonSchema.title = "extract";
  },
});

console.log("Without sanitization:");
console.log(JSON.stringify(withoutSanitization, null, 2));
console.log("\n");
console.log("With sanitization:");
console.log(JSON.stringify(withSanitization, null, 2));

This results in:

Without sanitization:
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "primary": {
      "$ref": "#/$defs/__schema0"
    },
    "secondary": {
      "$ref": "#/$defs/__schema0"
    }
  },
  "required": [
    "primary",
    "secondary"
  ],
  "additionalProperties": false,
  "title": "extract",
  "$defs": {
    "__schema0": {
      "type": "object",
      "properties": {
        "street": {
          "description": "The street of the address.",
          "type": "string",
          "title": "extract"
        },
        "city": {
          "description": "The city of the address.",
          "type": "string",
          "title": "extract"
        }
      },
      "required": [
        "street",
        "city"
      ],
      "additionalProperties": false,
      "title": "extract"
    }
  }
}


With sanitization:
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "primary": {
      "type": "object",
      "properties": {
        "street": {
          "$ref": "#/$defs/__schema0"
        },
        "city": {
          "$ref": "#/$defs/__schema1"
        }
      },
      "required": [
        "street",
        "city"
      ],
      "additionalProperties": false,
      "title": "extract"
    },
    "secondary": {
      "type": "object",
      "properties": {
        "street": {
          "$ref": "#/$defs/__schema0"
        },
        "city": {
          "$ref": "#/$defs/__schema1"
        }
      },
      "required": [
        "street",
        "city"
      ],
      "additionalProperties": false,
      "title": "extract"
    }
  },
  "required": [
    "primary",
    "secondary"
  ],
  "additionalProperties": false,
  "title": "extract",
  "$defs": {
    "__schema0": {
      "description": "The street of the address.",
      "type": "string",
      "title": "extract"
    },
    "__schema1": {
      "description": "The city of the address.",
      "type": "string",
      "title": "extract"
    }
  }
}

Error Message and Stack Trace (if applicable)

No response

Description

When using OpenAI, withStructuredOutput eventually makes this call:

toJSONSchemaV4(zodSchema, {
  cycles: "ref", // equivalent to nameStrategy: 'duplicate-ref'
  reused: "ref", // equivalent to $refStrategy: 'extract-to-root'
  override(ctx) {
    ctx.jsonSchema.title = name; // equivalent to `name` property
    // TODO: implement `nullableStrategy` patch-fix (zod doesn't support openApi3 json schema target)
    // TODO: implement `openaiStrictMode` patch-fix (where optional properties without `nullable` are not supported)
  },
  /// property equivalents from native `zodResponseFormat` fn
  // openaiStrictMode: true,
  // name,
  // nameStrategy: 'duplicate-ref',
  // $refStrategy: 'extract-to-root',
  // nullableStrategy: 'property',
})

By this point, the schema passed in has already been run through interopZodTransformInputSchema.

Because interopZodTransformInputSchema does not preserve reused references to subschemas, zod's toJSONSchema builds a JSON Schema that differs from the one the original schema would have produced.
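To illustrate why losing instance identity moves the refs, here is a minimal self-contained sketch. It assumes that a `reused: "ref"` strategy decides what to extract by counting visits to the same schema instance, which matches the output above but is my reading rather than confirmed behavior. `ToySchema`, `countVisits`, `reusedKinds`, and `naiveTransform` are hypothetical names for illustration and are not part of zod or LangChain.

```typescript
// Toy schema tree for illustration only; these are NOT zod's real types.
type ToySchema =
  | { kind: "string"; description: string }
  | { kind: "object"; shape: Record<string, ToySchema> };

// Visit the tree, counting how often each node *instance* is reached.
// A node that was already seen is counted again but not descended into.
function countVisits(
  node: ToySchema,
  seen: Map<ToySchema, number> = new Map()
): Map<ToySchema, number> {
  const prior = seen.get(node);
  if (prior !== undefined) {
    seen.set(node, prior + 1);
    return seen;
  }
  seen.set(node, 1);
  if (node.kind === "object") {
    for (const child of Object.values(node.shape)) countVisits(child, seen);
  }
  return seen;
}

// Kinds of the nodes reached more than once, i.e. the ones an
// identity-based "reused" strategy would pull out into $defs.
function reusedKinds(root: ToySchema): string[] {
  const kinds: string[] = [];
  countVisits(root).forEach((count, node) => {
    if (count > 1) kinds.push(node.kind);
  });
  return kinds;
}

// Hypothetical uncached transform: rebuilds every object node on each
// visit and passes leaf nodes through untouched.
function naiveTransform(node: ToySchema): ToySchema {
  if (node.kind !== "object") return node;
  const shape: Record<string, ToySchema> = {};
  for (const [key, child] of Object.entries(node.shape)) {
    shape[key] = naiveTransform(child);
  }
  return { kind: "object", shape };
}

const streetNode: ToySchema = { kind: "string", description: "The street of the address." };
const cityNode: ToySchema = { kind: "string", description: "The city of the address." };
const addressNode: ToySchema = { kind: "object", shape: { street: streetNode, city: cityNode } };
const rootNode: ToySchema = {
  kind: "object",
  shape: { primary: addressNode, secondary: addressNode },
};

console.log(reusedKinds(rootNode)); // [ 'object' ]
console.log(reusedKinds(naiveTransform(rootNode))); // [ 'string', 'string' ]
```

Before the transform, the only reused instance is the address object. After it, the two address objects are distinct copies, so the only instances still shared are the untouched string leaves, which mirrors the street and city defs in the output above.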

In the example above, the zod schema:

const addressSchema = z.object({
  street: z.string().describe("The street of the address."),
  city: z.string().describe("The city of the address."),
});

const testSchema = z.object({
  primary: addressSchema,
  secondary: addressSchema,
});

This should have produced a JSON Schema with a single def for the addressSchema subschema, referenced by both primary and secondary:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "primary": {
      "$ref": "#/$defs/__schema0"
    },
    "secondary": {
      "$ref": "#/$defs/__schema0"
    }
  },
  "required": [
    "primary",
    "secondary"
  ],
  "additionalProperties": false,
  "title": "extract",
  "$defs": {
    "__schema0": {
      "type": "object",
      "properties": {
        "street": {
          "description": "The street of the address.",
          "type": "string",
          "title": "extract"
        },
        "city": {
          "description": "The city of the address.",
          "type": "string",
          "title": "extract"
        }
      },
      "required": [
        "street",
        "city"
      ],
      "additionalProperties": false,
      "title": "extract"
    }
  }
}

Instead, it creates defs for the properties inside addressSchema (i.e. street and city) and reuses those:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "primary": {
      "type": "object",
      "properties": {
        "street": {
          "$ref": "#/$defs/__schema0"
        },
        "city": {
          "$ref": "#/$defs/__schema1"
        }
      },
      "required": [
        "street",
        "city"
      ],
      "additionalProperties": false,
      "title": "extract"
    },
    "secondary": {
      "type": "object",
      "properties": {
        "street": {
          "$ref": "#/$defs/__schema0"
        },
        "city": {
          "$ref": "#/$defs/__schema1"
        }
      },
      "required": [
        "street",
        "city"
      ],
      "additionalProperties": false,
      "title": "extract"
    }
  },
  "required": [
    "primary",
    "secondary"
  ],
  "additionalProperties": false,
  "title": "extract",
  "$defs": {
    "__schema0": {
      "description": "The street of the address.",
      "type": "string",
      "title": "extract"
    },
    "__schema1": {
      "description": "The city of the address.",
      "type": "string",
      "title": "extract"
    }
  }
}

This happens because interopZodTransformInputSchema does not cache the sanitized subschemas: when it encounters secondary, it creates another sanitized copy of addressSchema instead of reusing the one already created for primary.
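One possible shape for a fix is to memoize the transform per input instance, so a subschema that appears twice maps to the same sanitized output. The sketch below works on a toy tree rather than zod schemas; `ToySchema` and `sanitizeWithCache` are hypothetical names, and this is not the actual interopZodTransformInputSchema implementation.

```typescript
// Toy schema tree for illustration only; these are NOT zod's real types.
type ToySchema =
  | { kind: "string"; description: string }
  | { kind: "object"; shape: Record<string, ToySchema> };

// Rebuild object nodes (standing in for "sanitizing" them), but remember
// the result per input instance so repeated inputs share one output.
// Limitation of this sketch: the cache entry is written after recursing,
// so a self-referencing (cyclic) schema would still need extra handling.
function sanitizeWithCache(
  node: ToySchema,
  cache: Map<ToySchema, ToySchema> = new Map()
): ToySchema {
  const cached = cache.get(node);
  if (cached !== undefined) return cached;

  let result: ToySchema = node; // leaves pass through unchanged
  if (node.kind === "object") {
    const shape: Record<string, ToySchema> = {};
    for (const [key, child] of Object.entries(node.shape)) {
      shape[key] = sanitizeWithCache(child, cache);
    }
    result = { kind: "object", shape };
  }
  cache.set(node, result);
  return result;
}

const addressNode: ToySchema = {
  kind: "object",
  shape: {
    street: { kind: "string", description: "The street of the address." },
    city: { kind: "string", description: "The city of the address." },
  },
};
const rootNode: ToySchema = {
  kind: "object",
  shape: { primary: addressNode, secondary: addressNode },
};

const sanitizedRoot = sanitizeWithCache(rootNode);
const sharesInstance =
  sanitizedRoot.kind === "object" &&
  sanitizedRoot.shape["primary"] === sanitizedRoot.shape["secondary"];
console.log(sharesInstance); // true
```

With the cache in place, primary and secondary point at one sanitized object, so an identity-based reuse check would again see the address subschema as the reused node.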

At best this results in a less efficient JSON Schema. At worst, the generated JSON Schema is invalid (some properties end up completely empty).
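For anyone trying to check whether they are hitting the empty-property case, a small diagnostic like the following could be run on the generated JSON Schema. It is a rough heuristic sketch: `Json`, `isEmptyObject`, and `findEmptyProperties` are hypothetical helper names, and the sample input is hand-written for illustration, not real output from LangChain or zod.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function isEmptyObject(value: Json): boolean {
  return (
    value !== null &&
    typeof value === "object" &&
    !Array.isArray(value) &&
    Object.keys(value).length === 0
  );
}

// Walk a JSON Schema document and return JSON-pointer-like paths of every
// entry under a "properties" keyword whose schema is the empty object {}.
// Heuristic: a property that is itself named "properties" would be misread.
function findEmptyProperties(schema: Json, path: string = "#"): string[] {
  if (schema === null || typeof schema !== "object") return [];
  const hits: string[] = [];
  if (Array.isArray(schema)) {
    for (let i = 0; i < schema.length; i++) {
      hits.push(...findEmptyProperties(schema[i], `${path}/${i}`));
    }
    return hits;
  }
  for (const [key, value] of Object.entries(schema)) {
    const childPath = `${path}/${key}`;
    if (key === "properties" && value !== null && typeof value === "object" && !Array.isArray(value)) {
      for (const [prop, propSchema] of Object.entries(value)) {
        if (isEmptyObject(propSchema)) hits.push(`${childPath}/${prop}`);
      }
    }
    hits.push(...findEmptyProperties(value, childPath));
  }
  return hits;
}

// Hand-written sample with one deliberately empty property schema.
const sampleSchema: Json = {
  type: "object",
  properties: {
    primary: {
      type: "object",
      properties: { street: { type: "string" }, city: {} },
    },
  },
};

console.log(findEmptyProperties(sampleSchema));
// [ '#/properties/primary/properties/city' ]
```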

System Info

LangChain 0.3.35
Node v24.5.0
macOS 15.6.1 (24G90)


Labels: bug
