Description
Component
_build_pydantic_model_from_json_schema()
Summary:
The current implementation of `_build_pydantic_model_from_json_schema()` does not support `oneOf` with a `discriminator`, a standard OpenAPI / JSON Schema pattern for polymorphic objects.
When a schema uses this pattern, the function silently degrades it, resolving the polymorphic structures as loosely typed values (`str` / `dict`). No error is raised, but schema intent and type safety are lost.
Why This Matters
`oneOf` + `discriminator` is the canonical OpenAPI mechanism for modeling polymorphic request payloads, such as action-based APIs and workflow engines.
These schemas are:
- Valid JSON Schema
- Commonly generated by OpenAPI tools
- Expected to round-trip into typed models
Example Input Schema:

```json
{
  "$defs": {
    "CreateProject": {
      "description": "Action: Create an Azure DevOps project.",
      "properties": {
        "name": {
          "const": "create_project",
          "default": "create_project",
          "type": "string"
        },
        "params": {
          "$ref": "#/$defs/CreateProjectParams"
        }
      },
      "required": ["params"],
      "type": "object"
    },
    "CreateProjectParams": {
      "description": "Parameters for the create_project action.",
      "properties": {
        "projectName": {
          "minLength": 1,
          "type": "string"
        },
        "description": {
          "default": "",
          "type": "string"
        },
        "template": {
          "default": "Agile",
          "type": "string"
        },
        "sourceControl": {
          "default": "Git",
          "enum": ["Git", "Tfvc"],
          "type": "string"
        },
        "visibility": {
          "default": "private",
          "type": "string"
        }
      },
      "required": ["orgUrl", "projectName"],
      "type": "object"
    },
    "DeployRequest": {
      "description": "Request to deploy Azure DevOps resources.",
      "properties": {
        "projectName": {
          "minLength": 1,
          "type": "string"
        },
        "organization": {
          "minLength": 1,
          "type": "string"
        },
        "actions": {
          "items": {
            "discriminator": {
              "mapping": {
                "create_project": "#/$defs/CreateProject",
                "hello_world": "#/$defs/HelloWorld"
              },
              "propertyName": "name"
            },
            "oneOf": [
              { "$ref": "#/$defs/HelloWorld" },
              { "$ref": "#/$defs/CreateProject" }
            ]
          },
          "type": "array"
        }
      },
      "required": ["projectName", "organization"],
      "type": "object"
    },
    "HelloWorld": {
      "description": "Action: Prints a greeting message.",
      "properties": {
        "name": {
          "const": "hello_world",
          "default": "hello_world",
          "type": "string"
        },
        "params": {
          "$ref": "#/$defs/HelloWorldParams"
        }
      },
      "required": ["params"],
      "type": "object"
    },
    "HelloWorldParams": {
      "description": "Parameters for the hello_world action.",
      "properties": {
        "name": {
          "description": "Name to greet",
          "minLength": 1,
          "type": "string"
        }
      },
      "required": ["name"],
      "type": "object"
    }
  },
  "properties": {
    "params": {
      "$ref": "#/$defs/DeployRequest"
    }
  },
  "required": ["params"],
  "type": "object"
}
```
With the schema above, the current implementation:
- Ignores `oneOf`
- Ignores `discriminator`
- Ignores `const`
- Resolves `actions` as a loosely typed collection

Resulting model shape (effectively):

```python
class DeployRequest(BaseModel):
    projectName: str
    organization: str
    actions: list[str] | list[dict]
```
While the intended model is:

```python
from typing import Annotated, Union, Literal
from pydantic import BaseModel, Field

# (HelloWorldParams and CreateProjectParams are built from their schemas)

class HelloWorld(BaseModel):
    name: Literal["hello_world"]
    params: HelloWorldParams

class CreateProject(BaseModel):
    name: Literal["create_project"]
    params: CreateProjectParams

Action = Annotated[
    Union[HelloWorld, CreateProject],
    Field(discriminator="name")
]

class DeployRequest(BaseModel):
    projectName: str
    organization: str
    actions: list[Action]
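
For reference, a self-contained sketch (using minimal, hypothetical params models rather than the full generated ones) of what the discriminated union buys at validation time — each `actions` entry is dispatched to the right class by its `name` field:

```python
from typing import Annotated, Literal, Union
from pydantic import BaseModel, Field

class HelloWorldParams(BaseModel):
    name: str

class CreateProjectParams(BaseModel):
    projectName: str

class HelloWorld(BaseModel):
    name: Literal["hello_world"] = "hello_world"
    params: HelloWorldParams

class CreateProject(BaseModel):
    name: Literal["create_project"] = "create_project"
    params: CreateProjectParams

# The discriminator tells Pydantic to pick the variant by the "name" field
Action = Annotated[Union[HelloWorld, CreateProject], Field(discriminator="name")]

class DeployRequest(BaseModel):
    projectName: str
    organization: str
    actions: list[Action]

req = DeployRequest.model_validate({
    "projectName": "demo",
    "organization": "contoso",
    "actions": [
        {"name": "hello_world", "params": {"name": "Ada"}},
        {"name": "create_project", "params": {"projectName": "demo"}},
    ],
})
print(type(req.actions[0]).__name__)  # HelloWorld
print(type(req.actions[1]).__name__)  # CreateProject
```

With the loosely typed fallback (`list[str] | list[dict]`), both entries would survive validation as raw dicts and every consumer would have to re-dispatch on `name` by hand.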
I fixed this locally and was able to generate the correct schema for the AI function. The patched code is below for reference; let me know if I should also publish it on a branch.
Code Sample

```python
import json
from collections.abc import Mapping
from typing import Annotated, Any, Literal, Union

from pydantic import BaseModel, Field, create_model


def _build_pydantic_model_from_json_schema(
    model_name: str,
    schema: Mapping[str, Any],
) -> type[BaseModel]:
    """Creates a Pydantic model from JSON Schema with support for $refs, nested objects, and typed arrays.

    Args:
        model_name: The name of the model to be created.
        schema: The JSON Schema definition (should contain 'properties', 'required', '$defs', etc.).

    Returns:
        The dynamically created Pydantic model class.
    """
    properties = schema.get("properties")
    required = schema.get("required", [])
    definitions = schema.get("$defs", {})

    # Check if 'properties' is missing or not a dictionary
    if not properties:
        return create_model(f"{model_name}_input")

    # Bug fix: translate const/enum into Literal types
    def _resolve_literal_type(prop_details: dict[str, Any]) -> type | None:
        # const -> Literal["value"]
        if "const" in prop_details:
            return Literal[prop_details["const"]]  # type: ignore
        # enum -> Literal["a", "b", ...]
        if "enum" in prop_details and isinstance(prop_details["enum"], list):
            enum_values = prop_details["enum"]
            if enum_values:
                return Literal[tuple(enum_values)]  # type: ignore
        return None

    def _resolve_type(prop_details: dict[str, Any], parent_name: str = "") -> type:
        """Resolve JSON Schema type to Python type, handling $ref, nested objects, and typed arrays.

        Args:
            prop_details: The JSON Schema property details
            parent_name: Name to use for creating nested models (for uniqueness)

        Returns:
            Python type annotation (could be int, str, list[str], or a nested Pydantic model)
        """
        # Bug fix: handle oneOf + discriminator (polymorphic objects)
        if "oneOf" in prop_details and "discriminator" in prop_details:
            discriminator = prop_details["discriminator"]
            disc_field = discriminator.get("propertyName")
            variants = []
            for variant in prop_details["oneOf"]:
                if "$ref" in variant:
                    ref = variant["$ref"]
                    if ref.startswith("#/$defs/"):
                        def_name = ref.split("/")[-1]
                        resolved = definitions.get(def_name)
                        if resolved:
                            variant_model = _resolve_type(
                                resolved,
                                parent_name=f"{parent_name}_{def_name}",
                            )
                            variants.append(variant_model)
            if variants and disc_field:
                return Annotated[
                    Union[tuple(variants)],  # type: ignore
                    Field(discriminator=disc_field),
                ]

        # Handle $ref by resolving the reference
        if "$ref" in prop_details:
            ref = prop_details["$ref"]
            # Extract the reference path (e.g., "#/$defs/CustomerIdParam" -> "CustomerIdParam")
            if ref.startswith("#/$defs/"):
                def_name = ref.split("/")[-1]
                if def_name in definitions:
                    # Resolve the reference and use its type
                    resolved = definitions[def_name]
                    return _resolve_type(resolved, def_name)
            # If we can't resolve the ref, default to dict for safety
            return dict

        # Map JSON Schema types to Python types
        json_type = prop_details.get("type", "string")
        match json_type:
            case "integer":
                return int
            case "number":
                return float
            case "boolean":
                return bool
            case "array":
                # Handle typed arrays
                items_schema = prop_details.get("items")
                if items_schema and isinstance(items_schema, dict):
                    # Recursively resolve the item type
                    item_type = _resolve_type(items_schema, f"{parent_name}_item")
                    # Return list[ItemType] instead of bare list
                    return list[item_type]  # type: ignore
                # If no items schema or invalid, return bare list
                return list
            case "object":
                # Handle nested objects by creating a nested Pydantic model
                nested_properties = prop_details.get("properties")
                nested_required = prop_details.get("required", [])
                if nested_properties and isinstance(nested_properties, dict):
                    # Create the name for the nested model
                    nested_model_name = f"{parent_name}_nested" if parent_name else "NestedModel"
                    # Recursively build field definitions for the nested model
                    nested_field_definitions: dict[str, Any] = {}
                    for nested_prop_name, nested_prop_details in nested_properties.items():
                        nested_prop_details = (
                            json.loads(nested_prop_details)
                            if isinstance(nested_prop_details, str)
                            else nested_prop_details
                        )
                        # Bug fix: prefer const/enum Literal types over plain resolution
                        literal_type = _resolve_literal_type(nested_prop_details)
                        if literal_type is not None:
                            nested_python_type = literal_type
                        else:
                            nested_python_type = _resolve_type(
                                nested_prop_details,
                                f"{nested_model_name}_{nested_prop_name}",
                            )
                        nested_description = nested_prop_details.get("description", "")
                        # Build field kwargs for nested property
                        nested_field_kwargs: dict[str, Any] = {}
                        if nested_description:
                            nested_field_kwargs["description"] = nested_description
                        # Create field definition
                        if nested_prop_name in nested_required:
                            nested_field_definitions[nested_prop_name] = (
                                (nested_python_type, Field(**nested_field_kwargs))
                                if nested_field_kwargs
                                else (nested_python_type, ...)
                            )
                        else:
                            nested_field_kwargs["default"] = nested_prop_details.get("default", None)
                            nested_field_definitions[nested_prop_name] = (
                                nested_python_type,
                                Field(**nested_field_kwargs),
                            )
                    # Create and return the nested Pydantic model
                    return create_model(nested_model_name, **nested_field_definitions)  # type: ignore
                # If no properties defined, return bare dict
                return dict
            case _:
                return str  # default

    field_definitions: dict[str, Any] = {}
    for prop_name, prop_details in properties.items():
        prop_details = json.loads(prop_details) if isinstance(prop_details, str) else prop_details
        # Bug fix: prefer const/enum Literal types over plain resolution
        literal_type = _resolve_literal_type(prop_details)
        if literal_type is not None:
            python_type = literal_type
        else:
            python_type = _resolve_type(prop_details, f"{model_name}_{prop_name}")
        description = prop_details.get("description", "")
        # Build field kwargs (description, etc.)
        field_kwargs: dict[str, Any] = {}
        if description:
            field_kwargs["description"] = description
        # Create field definition for create_model
        if prop_name in required:
            if field_kwargs:
                field_definitions[prop_name] = (python_type, Field(**field_kwargs))
            else:
                field_definitions[prop_name] = (python_type, ...)
        else:
            default_value = prop_details.get("default", None)
            field_kwargs["default"] = default_value
            if field_kwargs and any(k != "default" for k in field_kwargs):
                field_definitions[prop_name] = (python_type, Field(**field_kwargs))
            else:
                field_definitions[prop_name] = (python_type, default_value)

    return create_model(f"{model_name}_input", **field_definitions)
```
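
The core mechanism the fix relies on can be demonstrated standalone — all names here (`Hello`, `Bye`, `Payload`) are hypothetical stand-ins, showing that variant models built dynamically with `create_model` (with `Literal` discriminator fields, as `const` values would produce) compose into a working discriminated union:

```python
from typing import Annotated, Literal, Union
from pydantic import Field, create_model

# Variant models built dynamically, each with a Literal discriminator field
Hello = create_model("Hello", name=(Literal["hello"], "hello"), msg=(str, ...))
Bye = create_model("Bye", name=(Literal["bye"], "bye"), code=(int, ...))

# The annotated union is a valid field type for another dynamic model
Action = Annotated[Union[Hello, Bye], Field(discriminator="name")]
Payload = create_model("Payload", actions=(list[Action], ...))

p = Payload.model_validate(
    {"actions": [{"name": "hello", "msg": "hi"}, {"name": "bye", "code": 0}]}
)
print(type(p.actions[0]).__name__)  # Hello
print(type(p.actions[1]).__name__)  # Bye
```

This is exactly the shape the patched `_resolve_type` returns for a `oneOf` + `discriminator` node, so the generated `DeployRequest` model validates `actions` into typed variant instances instead of raw dicts.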
Package Versions
agent-framework==1.0.0b260106