Changes from 5 commits
22 commits
f5911d3
Support enhanced JSON Schema features in Google Gemini 2.5+ models
conradlee Nov 6, 2025
083b369
Document discriminator field limitation and add test
conradlee Nov 6, 2025
270c8dd
Fix discriminator test and update with proper type annotations
conradlee Nov 6, 2025
78b174b
Fix tests and docs: Enhanced features only work with Vertex AI
conradlee Nov 6, 2025
46690c5
Create separate transformers for Vertex AI and GLA
conradlee Nov 6, 2025
98c24f4
Merge branch 'main' into feat/google-enhanced-json-schema
conradlee Nov 12, 2025
16871ef
remove verbose documentation of minor update
conradlee Nov 12, 2025
3e07326
Address PR review: Use response_json_schema and simplify implementation
conradlee Nov 12, 2025
2d9ea0d
Remove simplify_nullable_unions - Google supports type: 'null' natively
conradlee Nov 12, 2025
1bfaad9
Add tests for enhanced JSON Schema features and remove enum conversion
conradlee Nov 12, 2025
88be4f8
Test gemini-2.5-flash recursive schemas with Vertex AI (passes)
conradlee Nov 12, 2025
5a1faf7
Remove unnecessary __init__ override in GoogleJsonSchemaTransformer
conradlee Nov 12, 2025
b1d6433
Fix test failures: update snapshots for native JSON Schema support
conradlee Nov 12, 2025
d054438
Fix Vertex AI cassette: correct project ID and content-length
conradlee Nov 12, 2025
3c022cc
Merge branch 'main' into feat/google-enhanced-json-schema
DouweM Nov 12, 2025
dd63c0b
Update pydantic_ai_slim/pydantic_ai/models/google.py
DouweM Nov 12, 2025
36b2e38
Merge branch 'main' into feat/google-enhanced-json-schema
conradlee Nov 13, 2025
02a8231
Address maintainer review: fix comment typo and add prefixItems test
conradlee Nov 13, 2025
db5968a
Add cassette for prefixItems test
conradlee Nov 13, 2025
9254fd5
Remove dead code: simplify_nullable_unions feature
conradlee Nov 13, 2025
c760141
Remove additional dead code: single-member union collapse path
conradlee Nov 13, 2025
c977993
Merge branch 'main' into feat/google-enhanced-json-schema
conradlee Nov 13, 2025
65 changes: 65 additions & 0 deletions docs/models/google.md
@@ -201,6 +201,71 @@ agent = Agent(model)

`GoogleModel` supports multi-modal input, including documents, images, audio, and video. See the [input documentation](../input.md) for details and examples.

## Enhanced JSON Schema Support

!!! note "Vertex AI Only"
    The enhanced JSON Schema features listed below are **only available when using Vertex AI** (`google-vertex:` prefix or `GoogleProvider(vertexai=True)`). They are **not supported** in the Generative Language API (`google-gla:` prefix).
Collaborator

Note that https://ai.google.dev/gemini-api/docs/structured-output?example=feedback#model_support says we have to use `response_json_schema` instead of the `response_schema` key we currently set:

response_schema=response_schema,

response_schema=generation_config.get('response_schema'),

Once we do that, maybe it will work for both GLA and Vertex?
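
A minimal sketch of the key swap suggested in this comment, using a plain dict rather than the SDK's own config type. The helper `build_generation_config` is hypothetical and only illustrates the idea; the two key names `response_schema` and `response_json_schema` are the ones quoted in the comment.

```python
from typing import Any


def build_generation_config(schema: dict[str, Any], *, use_json_schema_key: bool = True) -> dict[str, Any]:
    """Hypothetical helper: place the schema under one of the two keys named in the review."""
    key = 'response_json_schema' if use_json_schema_key else 'response_schema'
    return {'response_mime_type': 'application/json', key: schema}


schema = {'type': 'object', 'properties': {'city': {'type': 'string'}}, 'required': ['city']}

# New behaviour proposed in the review: the schema travels under `response_json_schema`.
config = build_generation_config(schema)

# Previous behaviour, kept only for comparison.
legacy_config = build_generation_config(schema, use_json_schema_key=False)
```

Only one of the two keys is set in each config, which mirrors the reviewer's point that the new key replaces the old one rather than being sent alongside it.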

Author


You're right; I've updated the PR. Tests show no difference between GLA and Vertex AI (with one possible exception; see the comment below).


As of November 2025, Google Gemini models (2.5+) accessed via **Vertex AI** provide enhanced support for JSON Schema features when using [`NativeOutput`](../output.md#native-output), enabling more sophisticated structured outputs:

### Supported Features

- **Property Ordering**: The order of properties in your Pydantic model definition is now preserved in the output
- **Title Fields**: The `title` field is supported for providing short property descriptions
- **Union Types (`anyOf` and `oneOf`)**: Full support for conditional structures using Python's `Union` or `|` type syntax
- **Recursive Schemas (`$ref` and `$defs`)**: Full support for self-referential models and reusable schema definitions, enabling tree structures and recursive data
- **Numeric Constraints**: `minimum` and `maximum` constraints are respected (note: `exclusiveMinimum` and `exclusiveMaximum` are not yet supported)
- **Optional Fields (`type: 'null'`)**: Proper handling of optional fields with `None` values
- **Additional Properties**: Dictionary fields with `dict[str, T]` are fully supported
- **Tuple Types (`prefixItems`)**: Support for tuple-like array structures

### Example: Recursive Schema

```python {test="skip"}
from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.output import NativeOutput

class TreeNode(BaseModel):
    """A tree node that can contain child nodes."""

    value: int
    children: list['TreeNode'] | None = None

# Use Vertex AI (not GLA) for enhanced schema support
agent = Agent('google-vertex:gemini-2.5-pro', output_type=NativeOutput(TreeNode))

result = await agent.run('Create a tree with root value 1 and two children with values 2 and 3')
# result.output will be a TreeNode with proper structure
```
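
To make the `$ref`/`$defs` mechanics concrete, here is a rough, standard-library-only sketch. The schema is hand-written to approximate what Pydantic emits for `TreeNode` (titles and descriptions omitted), and `resolve_ref` is a hypothetical helper that only handles local `#/...` pointers without escape sequences.

```python
from typing import Any

# Hand-written approximation of the JSON Schema for the recursive TreeNode model.
tree_schema: dict[str, Any] = {
    '$defs': {
        'TreeNode': {
            'type': 'object',
            'properties': {
                'value': {'type': 'integer'},
                'children': {
                    'anyOf': [
                        {'type': 'array', 'items': {'$ref': '#/$defs/TreeNode'}},
                        {'type': 'null'},
                    ],
                    'default': None,
                },
            },
            'required': ['value'],
        }
    },
    '$ref': '#/$defs/TreeNode',
}


def resolve_ref(root: dict[str, Any], ref: str) -> dict[str, Any]:
    """Follow a local '#/a/b' reference inside the root schema (hypothetical helper)."""
    node: Any = root
    for part in ref.removeprefix('#/').split('/'):
        node = node[part]
    return node


node_def = resolve_ref(tree_schema, tree_schema['$ref'])
child_ref = node_def['properties']['children']['anyOf'][0]['items']['$ref']

# The child items point back at the same definition, which is what makes the schema recursive.
is_recursive = resolve_ref(tree_schema, child_ref) is node_def
```

Because the definition refers to itself, it cannot be fully inlined, which is why a provider that rejects `$ref` cannot accept this model.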

### Example: Union Types

```python {test="skip"}
from typing import Union, Literal
from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.output import NativeOutput

class Success(BaseModel):
    status: Literal['success']
    data: str


class Error(BaseModel):
    status: Literal['error']
    error_message: str


class Response(BaseModel):
    result: Union[Success, Error]

# Use Vertex AI (not GLA) for enhanced schema support
agent = Agent('google-vertex:gemini-2.5-pro', output_type=NativeOutput(Response))

result = await agent.run('Process this request successfully')
# result.output.result will be either Success or Error
```

See the [structured output documentation](../output.md) for more details on using `NativeOutput` with Pydantic models.

## Model settings

You can customize model behavior using [`GoogleModelSettings`][pydantic_ai.models.google.GoogleModelSettings]:
181 changes: 124 additions & 57 deletions pydantic_ai_slim/pydantic_ai/profiles/google.py
@@ -1,106 +1,173 @@
from __future__ import annotations as _annotations

import warnings

from pydantic_ai.exceptions import UserError

from .._json_schema import JsonSchema, JsonSchemaTransformer
from . import ModelProfile


def google_model_profile(model_name: str) -> ModelProfile | None:
"""Get the model profile for a Google model."""
"""Get the model profile for a Google model.
Note: This is a generic profile. For Google-specific providers, use:
- google_vertex_model_profile() for Vertex AI (supports enhanced JSON Schema)
- google_gla_model_profile() for Generative Language API (limited JSON Schema)
"""
is_image_model = 'image' in model_name
return ModelProfile(
json_schema_transformer=GoogleJsonSchemaTransformer,
json_schema_transformer=GoogleVertexJsonSchemaTransformer,
supports_image_output=is_image_model,
supports_json_schema_output=not is_image_model,
supports_json_object_output=not is_image_model,
supports_tools=not is_image_model,
)


class GoogleJsonSchemaTransformer(JsonSchemaTransformer):
"""Transforms the JSON Schema from Pydantic to be suitable for Gemini.
def google_vertex_model_profile(model_name: str) -> ModelProfile | None:
"""Get the model profile for a Google Vertex AI model.
Vertex AI supports enhanced JSON Schema features as of November 2025.
"""
is_image_model = 'image' in model_name
return ModelProfile(
json_schema_transformer=GoogleVertexJsonSchemaTransformer,
supports_image_output=is_image_model,
supports_json_schema_output=not is_image_model,
supports_json_object_output=not is_image_model,
supports_tools=not is_image_model,
)

Gemini which [supports](https://ai.google.dev/gemini-api/docs/function-calling#function_declarations)
a subset of OpenAPI v3.0.3.

Specifically:
* gemini doesn't allow the `title` keyword to be set
* gemini doesn't allow `$defs` — we need to inline the definitions where possible
def google_gla_model_profile(model_name: str) -> ModelProfile | None:
"""Get the model profile for a Google Generative Language API model.
GLA has more limited JSON Schema support compared to Vertex AI.
"""
is_image_model = 'image' in model_name
return ModelProfile(
json_schema_transformer=GoogleGLAJsonSchemaTransformer,
supports_image_output=is_image_model,
supports_json_schema_output=not is_image_model,
supports_json_object_output=not is_image_model,
supports_tools=not is_image_model,
)


class GoogleVertexJsonSchemaTransformer(JsonSchemaTransformer):
"""Transforms the JSON Schema from Pydantic to be suitable for Gemini via Vertex AI.
Gemini supports [a subset of OpenAPI v3.0.3](https://ai.google.dev/gemini-api/docs/function-calling#function_declarations).
As of November 2025, Gemini 2.5+ models via Vertex AI support enhanced JSON Schema features
(see [announcement](https://blog.google/technology/developers/gemini-api-structured-outputs/)) including:
* `title` for short property descriptions
* `anyOf` and `oneOf` for conditional structures (unions)
* `$ref` and `$defs` for recursive schemas and reusable definitions
* `minimum` and `maximum` for numeric constraints
* `additionalProperties` for dictionaries
* `type: 'null'` for optional fields
* `prefixItems` for tuple-like arrays
Not supported (empirically tested as of November 2025):
* `exclusiveMinimum` and `exclusiveMaximum` are not yet supported by the Google SDK
* `discriminator` field causes validation errors with nested oneOf schemas
"""

def __init__(self, schema: JsonSchema, *, strict: bool | None = None):
super().__init__(schema, strict=strict, prefer_inlined_defs=True, simplify_nullable_unions=True)
super().__init__(schema, strict=strict, prefer_inlined_defs=False, simplify_nullable_unions=True)

def transform(self, schema: JsonSchema) -> JsonSchema:
# Note: we need to remove `additionalProperties: False` since it is currently mishandled by Gemini
additional_properties = schema.pop(
'additionalProperties', None
) # don't pop yet so it's included in the warning
if additional_properties:
original_schema = {**schema, 'additionalProperties': additional_properties}
warnings.warn(
'`additionalProperties` is not supported by Gemini; it will be removed from the tool JSON schema.'
f' Full schema: {self.schema}\n\n'
f'Source of additionalProperties within the full schema: {original_schema}\n\n'
'If this came from a field with a type like `dict[str, MyType]`, that field will always be empty.\n\n'
"If Google's APIs are updated to support this properly, please create an issue on the Pydantic AI GitHub"
' and we will fix this behavior.',
UserWarning,
)

schema.pop('title', None)
# Remove properties not supported by Gemini
schema.pop('$schema', None)
if (const := schema.pop('const', None)) is not None:
# Gemini doesn't support const, but it does support enum with a single value
schema['enum'] = [const]
schema.pop('discriminator', None)
schema.pop('examples', None)

# TODO: Should we use the trick from pydantic_ai.models.openai._OpenAIJsonSchema
# where we add notes about these properties to the field description?
schema.pop('exclusiveMaximum', None)
schema.pop('exclusiveMinimum', None)

# Gemini only supports string enums, so we need to convert any enum values to strings.
# Pydantic will take care of transforming the transformed string values to the correct type.
if enum := schema.get('enum'):
schema['type'] = 'string'
schema['enum'] = [str(val) for val in enum]

type_ = schema.get('type')
if 'oneOf' in schema and 'type' not in schema: # pragma: no cover
# This gets hit when we have a discriminated union
# Gemini returns an API error in this case even though it says in its error message it shouldn't...
# Changing the oneOf to an anyOf prevents the API error and I think is functionally equivalent
schema['anyOf'] = schema.pop('oneOf')
if type_ == 'string' and (fmt := schema.pop('format', None)):
description = schema.get('description')
if description:
schema['description'] = f'{description} (format: {fmt})'
else:
schema['description'] = f'Format: {fmt}'

# As of November 2025, Gemini 2.5+ models via Vertex AI now support:
# - additionalProperties (for dict types)
# - $ref (for recursive schemas)
# - prefixItems (for tuple-like arrays)
# These are no longer stripped from the schema.

# Note: exclusiveMinimum/exclusiveMaximum are NOT yet supported by Google SDK,
# so we still need to strip them
schema.pop('exclusiveMinimum', None)
schema.pop('exclusiveMaximum', None)

return schema


class GoogleGLAJsonSchemaTransformer(JsonSchemaTransformer):
"""Transforms the JSON Schema from Pydantic to be suitable for Gemini via Generative Language API.
The Generative Language API (google-gla) has MORE LIMITED JSON Schema support compared to Vertex AI.
Notably, GLA does NOT support (as of November 2025):
* `additionalProperties` - causes validation error
* `$ref` and `$defs` - must be inlined
* `prefixItems` - not supported
* `title` - stripped
This transformer applies more aggressive transformations to ensure compatibility with GLA.
"""

def __init__(self, schema: JsonSchema, *, strict: bool | None = None):
# GLA requires $ref inlining
super().__init__(schema, strict=strict, prefer_inlined_defs=True, simplify_nullable_unions=True)

def transform(self, schema: JsonSchema) -> JsonSchema:
# Remove properties not supported by Gemini GLA
schema.pop('$schema', None)
if (const := schema.pop('const', None)) is not None:
# Gemini doesn't support const, but it does support enum with a single value
schema['enum'] = [const]
schema.pop('discriminator', None)
schema.pop('examples', None)

# GLA doesn't support title
schema.pop('title', None)

# Gemini only supports string enums
if enum := schema.get('enum'):
schema['type'] = 'string'
schema['enum'] = [str(val) for val in enum]

type_ = schema.get('type')
if type_ == 'string' and (fmt := schema.pop('format', None)):
description = schema.get('description')
if description:
schema['description'] = f'{description} (format: {fmt})'
else:
schema['description'] = f'Format: {fmt}'

if '$ref' in schema:
raise UserError(f'Recursive `$ref`s in JSON Schema are not supported by Gemini: {schema["$ref"]}')
# GLA does NOT support additionalProperties - must be stripped
if 'additionalProperties' in schema:
schema.pop('additionalProperties')

# GLA does NOT support prefixItems
if 'prefixItems' in schema:
Collaborator

We don't have a test yet that verifies that prefixItems now works

Author

@conradlee conradlee Nov 13, 2025

I've now added a test for this, based on a coordinate class whose JSON Schema representation looks like this:

{
  "description": "A 2D coordinate with latitude and longitude.",
  "properties": {
    "point": {
      "maxItems": 2,
      "minItems": 2,
      "prefixItems": [
        {
          "type": "number"
        },
        {
          "type": "number"
        }
      ],
      "title": "Point",
      "type": "array"
    }
  },
  "required": [
    "point"
  ],
  "title": "Coordinate",
  "type": "object"
}

Luckily, this test passes with the Google provider.
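
For comparison, the `prefixItems`-to-`items` fallback that appears in this hunk can be tried in isolation on the `point` schema from the comment above. This is a sketch under stated assumptions: `prefix_items_to_items` is a hypothetical standalone function that mirrors the visible logic on a plain dict and works on a shallow copy, whereas the transformer in the diff mutates the schema in place.

```python
from typing import Any


def prefix_items_to_items(schema: dict[str, Any]) -> dict[str, Any]:
    """Rewrite a tuple-style array schema into a plain `items` schema (hypothetical helper)."""
    schema = dict(schema)  # shallow copy so the caller's dict is left untouched
    prefix_items = schema.pop('prefixItems', None)
    if prefix_items is None:
        return schema

    items = schema.get('items')
    unique_items = [items] if items is not None else []
    for item in prefix_items:
        if item not in unique_items:
            unique_items.append(item)

    if len(unique_items) > 1:
        # Mixed element types: allow any of them at every position.
        schema['items'] = {'anyOf': unique_items}
    elif len(unique_items) == 1:
        schema['items'] = unique_items[0]

    # Keep the tuple length where the original schema did not already constrain it.
    schema.setdefault('minItems', len(prefix_items))
    if items is None:
        schema.setdefault('maxItems', len(prefix_items))
    return schema


point = {
    'maxItems': 2,
    'minItems': 2,
    'prefixItems': [{'type': 'number'}, {'type': 'number'}],
    'title': 'Point',
    'type': 'array',
}
converted = prefix_items_to_items(point)
mixed = prefix_items_to_items({'type': 'array', 'prefixItems': [{'type': 'string'}, {'type': 'integer'}]})
```

The conversion is lossy for mixed tuples: positional typing is replaced by an `anyOf` over all element types, so native `prefixItems` support preserves more of the original constraint.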

# prefixItems is not currently supported in Gemini, so we convert it to items for best compatibility
prefix_items = schema.pop('prefixItems')
items = schema.get('items')
unique_items = [items] if items is not None else []
for item in prefix_items:
if item not in unique_items:
unique_items.append(item)
if len(unique_items) > 1: # pragma: no cover
schema['items'] = {'anyOf': unique_items}
elif len(unique_items) == 1: # pragma: no branch
schema['items'] = unique_items[0]
schema.setdefault('minItems', len(prefix_items))
if items is None: # pragma: no branch
schema.setdefault('maxItems', len(prefix_items))
schema.pop('prefixItems')

# Note: exclusiveMinimum/exclusiveMaximum are NOT supported
schema.pop('exclusiveMinimum', None)
schema.pop('exclusiveMaximum', None)

return schema


# Backward compatibility alias
GoogleJsonSchemaTransformer = GoogleVertexJsonSchemaTransformer
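
A simplified stand-in for some of the per-node steps that both transformers in this file share: dropping `$schema`, `discriminator` and `examples`, turning `const` into a single-value `enum`, folding a string `format` into the description, and stripping the exclusive bounds. The function `transform_node` is hypothetical; it operates on one plain dict and does not recurse, whereas the real classes rely on the `JsonSchemaTransformer` base class to walk the schema.

```python
from typing import Any


def transform_node(schema: dict[str, Any]) -> dict[str, Any]:
    """Apply a subset of the shared clean-up steps to a single schema node (sketch only)."""
    schema = dict(schema)  # work on a shallow copy

    schema.pop('$schema', None)
    if (const := schema.pop('const', None)) is not None:
        # `const` is expressed as an enum with a single value instead.
        schema['enum'] = [const]
    schema.pop('discriminator', None)
    schema.pop('examples', None)

    if schema.get('type') == 'string' and (fmt := schema.pop('format', None)):
        # Preserve the format hint in the description rather than dropping it.
        description = schema.get('description')
        schema['description'] = f'{description} (format: {fmt})' if description else f'Format: {fmt}'

    # Exclusive bounds are stripped by both transformers in the diff.
    schema.pop('exclusiveMinimum', None)
    schema.pop('exclusiveMaximum', None)
    return schema


dated = transform_node({'type': 'string', 'format': 'date', 'description': 'Start day'})
literal = transform_node({'const': 'success'})
bounded = transform_node({'type': 'integer', 'minimum': 1, 'exclusiveMaximum': 10})
```

Note that stripping `exclusiveMaximum` silently widens the accepted range, so validation on the Pydantic side remains the place where the bound is actually enforced.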
4 changes: 2 additions & 2 deletions pydantic_ai_slim/pydantic_ai/providers/google_gla.py
@@ -8,7 +8,7 @@
 from pydantic_ai import ModelProfile
 from pydantic_ai.exceptions import UserError
 from pydantic_ai.models import cached_async_http_client
-from pydantic_ai.profiles.google import google_model_profile
+from pydantic_ai.profiles.google import google_gla_model_profile
 from pydantic_ai.providers import Provider


@@ -29,7 +29,7 @@ def client(self) -> httpx.AsyncClient:
         return self._client

     def model_profile(self, model_name: str) -> ModelProfile | None:
-        return google_model_profile(model_name)
+        return google_gla_model_profile(model_name)

     def __init__(self, api_key: str | None = None, http_client: httpx.AsyncClient | None = None) -> None:
         """Create a new Google GLA provider.
4 changes: 2 additions & 2 deletions pydantic_ai_slim/pydantic_ai/providers/google_vertex.py
@@ -13,7 +13,7 @@
 from pydantic_ai import ModelProfile
 from pydantic_ai.exceptions import UserError
 from pydantic_ai.models import cached_async_http_client
-from pydantic_ai.profiles.google import google_model_profile
+from pydantic_ai.profiles.google import google_vertex_model_profile
 from pydantic_ai.providers import Provider

 try:
@@ -53,7 +53,7 @@ def client(self) -> httpx.AsyncClient:
         return self._client

     def model_profile(self, model_name: str) -> ModelProfile | None:
-        return google_model_profile(model_name)
+        return google_vertex_model_profile(model_name)

     @overload
     def __init__(
23 changes: 14 additions & 9 deletions tests/models/test_gemini.py
@@ -445,6 +445,8 @@ class Locations(BaseModel):


 async def test_json_def_recursive(allow_model_requests: None):
+    """Test that recursive schemas with $ref are now supported (as of November 2025)."""
+
     class Location(BaseModel):
         lat: float
         lng: float
@@ -479,15 +481,18 @@ class Location(BaseModel):
         description='This is the tool for the final Result',
         parameters_json_schema=json_schema,
     )
-    with pytest.raises(UserError, match=r'Recursive `\$ref`s in JSON Schema are not supported by Gemini'):
-        mrp = ModelRequestParameters(
-            function_tools=[],
-            allow_text_output=True,
-            output_tools=[output_tool],
-            output_mode='text',
-            output_object=None,
-        )
-        mrp = m.customize_request_parameters(mrp)
+    # As of November 2025, Gemini 2.5+ models support recursive $ref in JSON Schema
+    # This should no longer raise an error
+    mrp = ModelRequestParameters(
+        function_tools=[],
+        allow_text_output=True,
+        output_tools=[output_tool],
+        output_mode='text',
+        output_object=None,
+    )
+    mrp = m.customize_request_parameters(mrp)
+    # Verify the schema still contains $ref after customization
+    assert '$ref' in mrp.output_tools[0].parameters_json_schema
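
The final assertion in this test checks only the top-level keys of the schema. If a deeper check were wanted, a recursive walk along these lines could be used; `contains_ref` is a hypothetical helper, not part of the PR, and the sample schemas are hand-written for illustration.

```python
from typing import Any


def contains_ref(node: Any) -> bool:
    """Return True if a `$ref` key appears anywhere in a nested schema (hypothetical helper)."""
    if isinstance(node, dict):
        return '$ref' in node or any(contains_ref(value) for value in node.values())
    if isinstance(node, list):
        return any(contains_ref(item) for item in node)
    return False


nested = {
    'type': 'object',
    'properties': {
        'nested_locations': {'type': 'array', 'items': {'$ref': '#/$defs/Location'}},
    },
}
inlined = {'type': 'object', 'properties': {'lat': {'type': 'number'}}}

has_nested_ref = contains_ref(nested)
has_inlined_ref = contains_ref(inlined)
```

A walk like this would also catch references that survive only inside `properties` or `items`, which a top-level membership test would miss.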


async def test_json_def_date(allow_model_requests: None):