@fede-kamel fede-kamel commented Jan 29, 2026

Bug

convert_to_oci_tool strips JSON Schema constraints (enum, min/max, pattern, format) from tool definitions before sending them to the model. A tool defined with enum: ["cpu", "memory", "disk", "network"] gets reduced to just type: string — the model never sees the allowed values.

Additionally, Optional fields (which Pydantic v2 represents with an anyOf pattern) are converted to type: "any", which is not a valid JSON Schema type. This causes Gemini models to return 400 errors.

Root cause: the old code uses tool.args, which goes through Pydantic's tool_call_schema re-generation, and then keeps only type and description:

# old code
p_name: {
    "type": p_def.get("type", "any"),
    "description": p_def.get("description", ""),
}
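
To make the root cause concrete, here is a minimal sketch that applies the same "keep only type and description" projection to a hand-written schema. The `properties` dict, its description strings, and the helper name `lossy_convert` are assumptions made for illustration; they are not taken from the repository, and the anyOf shape is only what Pydantic v2 typically emits for an Optional[int] field with ge/le bounds.

```python
# Hypothetical JSON Schema "properties" for the tool used in the reproduction,
# written by hand instead of being generated by Pydantic.
properties = {
    "metric_name": {
        "type": "string",
        "description": "Metric to query",
        "enum": ["cpu", "memory", "disk", "network"],
    },
    "duration_hours": {
        # Typical Pydantic v2 shape for Optional[int] with ge=1, le=168.
        "anyOf": [
            {"type": "integer", "minimum": 1, "maximum": 168},
            {"type": "null"},
        ],
        "description": "Lookback window in hours",
    },
}


def lossy_convert(props):
    """Mimic the old conversion: keep only type and description."""
    return {
        p_name: {
            "type": p_def.get("type", "any"),
            "description": p_def.get("description", ""),
        }
        for p_name, p_def in props.items()
    }


converted = lossy_convert(properties)
print(converted["metric_name"])     # the enum key is gone
print(converted["duration_hours"])  # no top-level "type", so it falls back to "any"
```

Because the anyOf wrapper has no top-level type key, the `.get("type", "any")` fallback is what produces the invalid "any" type.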

Reproduction

Tool with enum: ["cpu", "memory", "disk", "network"], prompt: "Check the latency metrics" (intentionally not in enum).

main branch (broken): the enum is stripped and every model returns an out-of-enum value.

meta.llama-3.3-70b-instruct:    metric_name = "latency"  (not a valid value)
google.gemini-2.5-flash:        metric_name = "latency"  (not a valid value)
google.gemini-2.5-pro:          metric_name = "latency"  (not a valid value)
google.gemini-2.5-flash-lite:   metric_name = "latency"  (not a valid value)

fix branch — enum preserved, models respect constraints:

meta.llama-3.3-70b-instruct:    metric_name = "network"  (closest valid value)
google.gemini-2.5-flash:        refused — "can only query cpu, memory, disk, network"
google.gemini-2.5-pro:          refused — "available metrics are: cpu, memory, disk, network"
google.gemini-2.5-flash-lite:   refused — "can only query cpu, memory, disk, network"

MCP tool reproduction

Same test with a FastMCP server (MetricName enum + duration_hours with ge=1, le=168).

main branch — MCP tools broken on Gemini, hallucinated on Llama:

What main sends to OCI:
  metric_name:    {"type": "string", "description": ""}     ← enum stripped
  duration_hours: {"type": "any", "description": "..."}     ← anyOf becomes "any"

meta.llama-3.3-70b:      metric_name = "latency"  (hallucinated)
google.gemini-2.5-flash: 400 error  (type "any" rejected by Gemini)
google.gemini-2.5-pro:   400 error  (type "any" rejected by Gemini)

fix branch — MCP tools work correctly:

What fix sends to OCI:
  metric_name:    {"type": "string", "enum": ["cpu","memory","disk","network"]}
  duration_hours: {"type": "integer", "minimum": 1, "maximum": 168}

meta.llama-3.3-70b:      metric_name = "network"  (valid)
google.gemini-2.5-flash: refused — "does not support latency"
google.gemini-2.5-pro:   refused — "can check: cpu, memory, disk, network"

Fix

  • Use args_schema.model_json_schema() instead of tool.args to get the original schema with constraints intact
  • _sanitize_tool_property() filters through an allowlist of standard JSON Schema keys (enum, format, min/max, pattern, items, etc.) and resolves Pydantic v2 anyOf patterns for Optional[T]
  • For Cohere: _enrich_description() embeds constraints into the description string since CohereParameterDefinition only supports type/description/is_required
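
The approach in the list above can be sketched roughly as follows. This is not the PR's implementation: the function names drop the leading underscore, the `ALLOWED_KEYS` set is an assumed allowlist, the fallback to "string" for unresolved unions is an assumption, and the description format used for Cohere is invented for illustration.

```python
from typing import Any

# Assumed allowlist of JSON Schema keys; the real set in the PR may differ.
ALLOWED_KEYS = {
    "type", "description", "enum", "format", "pattern",
    "minimum", "maximum", "exclusiveMinimum", "exclusiveMaximum",
    "minLength", "maxLength", "minItems", "maxItems",
    "items", "properties", "required",
}


def sanitize_tool_property(prop: dict[str, Any]) -> dict[str, Any]:
    """Resolve Optional[T]-style anyOf, then keep only allowlisted keys."""
    prop = dict(prop)  # do not mutate the caller's schema
    branches = prop.pop("anyOf", None)
    if branches:
        non_null = [b for b in branches if b.get("type") != "null"]
        if len(non_null) == 1:
            # Optional[T]: merge T's schema; outer keys such as description win.
            prop = {**non_null[0], **prop}
    out = {k: v for k, v in prop.items() if k in ALLOWED_KEYS}
    # Recurse into nested array and object schemas.
    if isinstance(out.get("items"), dict):
        out["items"] = sanitize_tool_property(out["items"])
    if isinstance(out.get("properties"), dict):
        out["properties"] = {
            name: sanitize_tool_property(sub)
            for name, sub in out["properties"].items()
        }
    # Assumption: fall back to a valid type instead of the invalid "any".
    out.setdefault("type", "string")
    return out


def enrich_description(prop: dict[str, Any]) -> str:
    """Fold constraints into the description for type/description-only APIs."""
    parts = []
    if "enum" in prop:
        parts.append("allowed values: " + ", ".join(map(str, prop["enum"])))
    for key in ("minimum", "maximum", "pattern", "format"):
        if key in prop:
            parts.append(f"{key}: {prop[key]}")
    desc = prop.get("description", "")
    if not parts:
        return desc
    return f"{desc} ({'; '.join(parts)})".strip()


optional_int = {
    "anyOf": [{"type": "integer", "minimum": 1, "maximum": 168}, {"type": "null"}],
    "title": "Duration Hours",
    "description": "Lookback window in hours",
}
sanitized = sanitize_tool_property(optional_int)
cohere_desc = enrich_description(
    {"type": "string", "description": "Metric to query",
     "enum": ["cpu", "memory", "disk", "network"]}
)
print(sanitized)
print(cohere_desc)
```

An allowlist, rather than a blocklist, means Pydantic-specific keys such as title never reach the provider, at the cost of having to extend the set when a new constraint key should be passed through.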

Tests

  • 22 unit tests for _sanitize_tool_property (enum, format, ranges, patterns, nested objects, anyOf resolution, MCP round-trips)
  • 6 integration tests against live OCI (Meta Llama + Cohere): enum constraints, numeric ranges, schema verification

@oracle-contributor-agreement oracle-contributor-agreement bot added the "OCA Verified" label (All contributors have signed the Oracle Contributor Agreement.) Jan 29, 2026
@fede-kamel fede-kamel force-pushed the provider-improvements branch from 2f25a71 to 9c291fb January 29, 2026 21:24
@fede-kamel fede-kamel changed the title from "Provider improvements: tool schema handling, Google provider, reasoning content" to "Fix tool schema constraints dropped during OCI tool conversion" Jan 29, 2026
@fede-kamel fede-kamel force-pushed the provider-improvements branch from 9c291fb to 42469cc January 29, 2026 21:49
@fede-kamel fede-kamel changed the title from "Fix tool schema constraints dropped during OCI tool conversion" to "Fix tool schema constraints dropped during OCI conversion (breaks MCP + Gemini)" Jan 29, 2026
@fede-kamel fede-kamel force-pushed the provider-improvements branch from 42469cc to ae4f0ce January 29, 2026 23:02
convert_to_oci_tool uses tool.args which goes through Pydantic's
tool_call_schema re-generation, stripping enum, min/max, pattern,
format, and other JSON Schema properties. Additionally, Optional
fields (anyOf pattern) get converted to type "any" which is not
valid JSON Schema and causes Gemini models to return 400 errors.

Reproduced end-to-end with a FastMCP server defining a tool with
enum and range constraints. On main, all models hallucinate invalid
values and Gemini crashes with 400. With this fix, Llama 3.3 picks
the closest valid enum value, Gemini models refuse invalid inputs.

Fix: use args_schema.model_json_schema() to get the original schema,
add _sanitize_tool_property() to filter through an allowlist of
standard JSON Schema keys and resolve anyOf patterns.

For Cohere: add _enrich_description() to embed constraints into the
description string since CohereParameterDefinition only supports
type/description/is_required.
@fede-kamel fede-kamel force-pushed the provider-improvements branch from ae4f0ce to 67c936d January 29, 2026 23:05
Rename test_model_with_image to run_model_with_image so pytest
does not collect it as a test function. It is a helper called
from main() when the script is run directly.
@fede-kamel fede-kamel force-pushed the provider-improvements branch from 67c936d to 0b3c6b8 January 30, 2026 01:05