andrewsanchez commented Nov 3, 2025

This is intended to close #1222.

However, it raises some questions about the correct/expected behavior, so I'm a bit stuck on how to test it.

Here is my understanding of the correct behavior: if both a schema and one or more tools are provided, we will always have a chain of

  • two or more LLM requests
    • first, a request with tool definitions
    • second, a request with the tool execution results
  • two or more responses
    • first, a response with the tool call
    • second, a final response that considers the tool execution results
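To make the chain above concrete, here is a minimal, self-contained sketch of the loop. It is not the implementation in llm: `run_chain`, `fake_model`, and the stubbed `llm_time` are hypothetical names used only for illustration, and the fake model stands in for the real API so the example runs offline. The message shapes follow the request and response bodies shown in the logs further down.

```python
import json
from typing import Any, Callable

Message = dict[str, Any]


def run_chain(
    call_model: Callable[[list[Message]], Message],
    tools: dict[str, Callable[..., Any]],
    prompt: str,
    max_rounds: int = 5,
) -> tuple[str, int]:
    """Return the final assistant content and the number of requests sent."""
    messages: list[Message] = [{"role": "user", "content": prompt}]
    for round_number in range(1, max_rounds + 1):
        reply = call_model(messages)
        tool_calls = reply.get("tool_calls")
        if not tool_calls:
            # No tool call requested: this is the final response.
            return reply["content"], round_number
        # Echo the tool call back, then append one tool message per call.
        messages.append({"role": "assistant", "tool_calls": tool_calls})
        for call in tool_calls:
            function = call["function"]
            arguments = json.loads(function["arguments"] or "{}")
            result = tools[function["name"]](**arguments)
            messages.append(
                {
                    "role": "tool",
                    "tool_call_id": call["id"],
                    "content": json.dumps(result),
                }
            )
    raise RuntimeError("model kept requesting tools")


def fake_model(messages: list[Message]) -> Message:
    """Hypothetical stand-in for the API: tool call first, answer second."""
    if not any(m["role"] == "tool" for m in messages):
        return {
            "role": "assistant",
            "content": None,
            "tool_calls": [
                {
                    "id": "call_1",
                    "type": "function",
                    "function": {"name": "llm_time", "arguments": "{}"},
                }
            ],
        }
    tool_output = json.loads(messages[-1]["content"])
    return {
        "role": "assistant",
        "content": json.dumps({"date": tool_output["utc_time"][:10]}),
    }


def llm_time() -> dict[str, str]:
    """Stubbed tool with a fixed timestamp so the run is deterministic."""
    return {"utc_time": "2025-11-17 16:33:46 UTC"}


final, requests_sent = run_chain(fake_model, {"llm_time": llm_time}, "What is the time?")
print(final, requests_sent)
```

With this stub the loop sends two requests: one that yields the tool call and one that yields the final answer.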

When a structured output is desired, I have the following assumptions about the correct behavior:

The initial LLM call
The expected response is one or more tool calls, not a structured output. The structured output is expected in a subsequent response, after we send the tool call results back to the LLM. Therefore, I wonder whether the schema should be excluded from this initial request.

Even if we do include the schema in all requests, I think it will work just fine. At least, that is what I have observed so far with gpt-5. However, for clarity, I wonder if it would be better not to send the schema with the first request.
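One way to express the "don't send the schema with the first request" option is sketched below. This is only an illustration of the idea, not code from this PR: `build_body`, `has_tool_results`, and the `defer_schema` flag are hypothetical names, and the body layout is copied from the logged requests further down.

```python
from typing import Any, Optional

SCHEMA = {
    "type": "object",
    "properties": {"date": {"type": "string", "description": "current date"}},
    "required": ["date"],
}
TOOL_DEFS = [
    {
        "type": "function",
        "function": {
            "name": "llm_time",
            "description": "Returns the current time, as local time and UTC",
            "parameters": {"properties": {}, "type": "object"},
        },
    }
]


def has_tool_results(messages: list[dict[str, Any]]) -> bool:
    """True once at least one tool result has been appended."""
    return any(m.get("role") == "tool" for m in messages)


def build_body(
    messages: list[dict[str, Any]],
    tool_defs: list[dict[str, Any]],
    schema: Optional[dict[str, Any]] = None,
    defer_schema: bool = True,  # hypothetical switch between the two options
) -> dict[str, Any]:
    body: dict[str, Any] = {"messages": messages, "model": "gpt-5", "stream": False}
    if tool_defs:
        body["tools"] = tool_defs
    if schema is not None:
        # Assumption: hold the schema back only while tools are available
        # and no tool result has been sent yet.
        waiting_for_tools = bool(tool_defs) and not has_tool_results(messages)
        if not (defer_schema and waiting_for_tools):
            body["response_format"] = {
                "type": "json_schema",
                "json_schema": {"name": "output", "schema": schema},
            }
    return body


first = build_body([{"role": "user", "content": "What is the time?"}], TOOL_DEFS, SCHEMA)
later = build_body(
    [
        {"role": "user", "content": "What is the time?"},
        {"role": "tool", "tool_call_id": "call_1", "content": "{}"},
    ],
    TOOL_DEFS,
    SCHEMA,
)
print("response_format" in first, "response_format" in later)
```

A caveat with deferring: if the model answers the first request directly instead of calling a tool, that answer would not be constrained by the schema, so a follow-up request carrying the schema would be needed.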

Here are two examples to illustrate.

Send the schema with the initial and subsequent requests

➜ LLM_OPENAI_SHOW_RESPONSES=1 uv run llm --tool llm_time "What is the time?" --td --no-stream --schema "date: current date"
Request: POST https://api.openai.com/v1/chat/completions
  Headers:
    host: api.openai.com
    connection: keep-alive
    accept: application/json
    content-type: application/json
    user-agent: OpenAI/Python 2.6.1
    x-stainless-lang: python
    x-stainless-package-version: 2.6.1
    x-stainless-os: MacOS
    x-stainless-arch: arm64
    x-stainless-runtime: CPython
    x-stainless-runtime-version: 3.13.5
    authorization: [...]
    x-stainless-async: false
    x-stainless-retry-count: 0
    x-stainless-read-timeout: 600
    content-length: 481
  Body:
    {
      "messages": [
        {
          "role": "user",
          "content": "What is the time?"
        }
      ],
      "model": "gpt-5",
      "reasoning_effort": "minimal",
      "response_format": {
        "type": "json_schema",
        "json_schema": {
          "name": "output",
          "schema": {
            "type": "object",
            "properties": {
              "date": {
                "type": "string",
                "description": "current date"
              }
            },
            "required": [
              "date"
            ]
          }
        }
      },
      "stream": false,
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "llm_time",
            "description": "Returns the current time, as local time and UTC",
            "parameters": {
              "properties": {},
              "type": "object"
            }
          }
        }
      ]
    }
Response: status_code=200
  Headers:
    date: Mon, 17 Nov 2025 16:33:46 GMT
    content-type: application/json
    content-length: 1024
    connection: keep-alive
    access-control-expose-headers: X-Request-ID
    openai-organization: ankihub-lhcs7y
    openai-processing-ms: 2283
    openai-project: proj_Mg8izjsUoAvN6d6P4UOO1cr3
    openai-version: 2020-10-01
    x-envoy-upstream-service-time: 2320
    x-ratelimit-limit-requests: 15000
    x-ratelimit-limit-tokens: 40000000
    x-ratelimit-remaining-requests: 14999
    x-ratelimit-remaining-tokens: 39999993
    x-ratelimit-reset-requests: 4ms
    x-ratelimit-reset-tokens: 0s
    x-request-id: req_24ca5591155a446ea45f9c3b4feeff52
    x-openai-proxy-wasm: v0.1
    cf-cache-status: DYNAMIC
    set-cookie: __cf_bm=...
    strict-transport-security: max-age=31536000; includeSubDomains; preload
    x-content-type-options: nosniff
    server: cloudflare
    cf-ray: 9a00a1aa1c918538-EWR
    alt-svc: h3=":443"; ma=86400
  Body:
{
  "id": "chatcmpl-CcwSeiTlbIlFbBSwDj8eIQQmWHEDO",
  "object": "chat.completion",
  "created": 1763397224,
  "model": "gpt-5-2025-08-07",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_Ft3F3KKA3RWFHrgh0gq3TMLm",
            "type": "function",
            "function": {
              "name": "llm_time",
              "arguments": "{}"
            }
          }
        ],
        "refusal": null,
        "annotations": []
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 155,
    "completion_tokens": 20,
    "total_tokens": 175,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "service_tier": "default",
  "system_fingerprint": null
}


Tool call: llm_time({})
  {
    "utc_time": "2025-11-17 16:33:46 UTC",
    "utc_time_iso": "2025-11-17T16:33:46.443968+00:00",
    "local_timezone": "EST",
    "local_time": "2025-11-17 11:33:46",
    "timezone_offset": "UTC-5:00",
    "is_dst": false
  }

Request: POST https://api.openai.com/v1/chat/completions
  Headers:
    host: api.openai.com
    connection: keep-alive
    accept: application/json
    content-type: application/json
    user-agent: OpenAI/Python 2.6.1
    x-stainless-lang: python
    x-stainless-package-version: 2.6.1
    x-stainless-os: MacOS
    x-stainless-arch: arm64
    x-stainless-runtime: CPython
    x-stainless-runtime-version: 3.13.5
    authorization: [...]
    x-stainless-async: false
    x-stainless-retry-count: 0
    x-stainless-read-timeout: 600
    content-length: 921
  Body:
    {
      "messages": [
        {
          "role": "user",
          "content": "What is the time?"
        },
        {
          "role": "assistant",
          "tool_calls": [
            {
              "type": "function",
              "id": "call_Ft3F3KKA3RWFHrgh0gq3TMLm",
              "function": {
                "name": "llm_time",
                "arguments": "{}"
              }
            }
          ]
        },
        {
          "role": "tool",
          "tool_call_id": "call_Ft3F3KKA3RWFHrgh0gq3TMLm",
          "content": "{\"utc_time\": \"2025-11-17 16:33:46 UTC\", \"utc_time_iso\": \"2025-11-17T16:33:46.443968+00:00\", \"local_timezone\": \"EST\", \"local_time\": \"2025-11-17 11:33:46\", \"timezone_offset\": \"UTC-5:00\", \"is_dst\": false}"
        }
      ],
      "model": "gpt-5",
      "reasoning_effort": "minimal",
      "response_format": {
        "type": "json_schema",
        "json_schema": {
          "name": "output",
          "schema": {
            "type": "object",
            "properties": {
              "date": {
                "type": "string",
                "description": "current date"
              }
            },
            "required": [
              "date"
            ]
          }
        }
      },
      "stream": false,
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "llm_time",
            "description": "Returns the current time, as local time and UTC",
            "parameters": {
              "properties": {},
              "type": "object"
            }
          }
        }
      ]
    }
Response: status_code=200
  Headers:
    date: Mon, 17 Nov 2025 16:33:48 GMT
    content-type: application/json
    content-length: 859
    connection: keep-alive
    access-control-expose-headers: X-Request-ID
    openai-organization: ankihub-lhcs7y
    openai-processing-ms: 1902
    openai-project: proj_Mg8izjsUoAvN6d6P4UOO1cr3
    openai-version: 2020-10-01
    x-envoy-upstream-service-time: 1938
    x-ratelimit-limit-requests: 15000
    x-ratelimit-limit-tokens: 40000000
    x-ratelimit-remaining-requests: 14999
    x-ratelimit-remaining-tokens: 39999941
    x-ratelimit-reset-requests: 4ms
    x-ratelimit-reset-tokens: 0s
    x-request-id: req_99c4a99cbe8c4ebe94b03b7cd9298745
    x-openai-proxy-wasm: v0.1
    cf-cache-status: DYNAMIC
    set-cookie: __cf_bm=...
    strict-transport-security: max-age=31536000; includeSubDomains; preload
    x-content-type-options: nosniff
    server: cloudflare
    cf-ray: 9a00a1ba1d61c47d-EWR
    alt-svc: h3=":443"; ma=86400
  Body:
{
  "id": "chatcmpl-CcwSgderNakid0y8tjMi7rznJkBfH",
  "object": "chat.completion",
  "created": 1763397226,
  "model": "gpt-5-2025-08-07",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\"date\":\"Local time: 2025-11-17 11:33:46 (EST, UTC-5). UTC time: 2025-11-17 16:33:46.\"}",
        "refusal": null,
        "annotations": []
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 268,
    "completion_tokens": 50,
    "total_tokens": 318,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "service_tier": "default",
  "system_fingerprint": null
}

{"date":"Local time: 2025-11-17 11:33:46 (EST, UTC-5). UTC time: 2025-11-17 16:33:46."}

Exclude the schema from the initial request

➜ LLM_OPENAI_SHOW_RESPONSES=1 uv run llm --tool llm_time "Call the llm_time tool to calculate the time. Only respond with the tool call." --td --no-stream
Request: POST https://api.openai.com/v1/chat/completions
  Headers:
    host: api.openai.com
    connection: keep-alive
    accept: application/json
    content-type: application/json
    user-agent: OpenAI/Python 2.6.1
    x-stainless-lang: python
    x-stainless-package-version: 2.6.1
    x-stainless-os: MacOS
    x-stainless-arch: arm64
    x-stainless-runtime: CPython
    x-stainless-runtime-version: 3.13.5
    authorization: [...]
    x-stainless-async: false
    x-stainless-retry-count: 0
    x-stainless-read-timeout: 600
    content-length: 353
  Body:
    {
      "messages": [
        {
          "role": "user",
          "content": "Call the llm_time tool to calculate the time. Only respond with the tool call."
        }
      ],
      "model": "gpt-5",
      "reasoning_effort": "minimal",
      "stream": false,
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "llm_time",
            "description": "Returns the current time, as local time and UTC",
            "parameters": {
              "properties": {},
              "type": "object"
            }
          }
        }
      ]
    }
Response: status_code=200
  Headers:
    date: Mon, 03 Nov 2025 16:46:36 GMT
    content-type: application/json
    content-length: 1024
    connection: keep-alive
    access-control-expose-headers: X-Request-ID
    openai-organization: ankihub-lhcs7y
    openai-processing-ms: 2174
    openai-project: proj_Mg8izjsUoAvN6d6P4UOO1cr3
    openai-version: 2020-10-01
    x-envoy-upstream-service-time: 2203
    x-ratelimit-limit-requests: 15000
    x-ratelimit-limit-tokens: 40000000
    x-ratelimit-remaining-requests: 14999
    x-ratelimit-remaining-tokens: 39999978
    x-ratelimit-reset-requests: 4ms
    x-ratelimit-reset-tokens: 0s
    x-request-id: req_fe6cac33132d4cb7bda57d2469bd223c
    x-openai-proxy-wasm: v0.1
    cf-cache-status: DYNAMIC
    set-cookie: __cf_bm=...
    strict-transport-security: max-age=31536000; includeSubDomains; preload
    x-content-type-options: nosniff
    server: cloudflare
    cf-ray: 998d59383a9b6180-EWR
    alt-svc: h3=":443"; ma=86400
  Body:
{
  "id": "chatcmpl-CXrzOJA1jkbwzxQ2qpxUwSEsiq7q4",
  "object": "chat.completion",
  "created": 1762188394,
  "model": "gpt-5-2025-08-07",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_jz4QuW0Dtrqvqbbre3uYEKe2",
            "type": "function",
            "function": {
              "name": "llm_time",
              "arguments": "{}"
            }
          }
        ],
        "refusal": null,
        "annotations": []
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 142,
    "completion_tokens": 20,
    "total_tokens": 162,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "service_tier": "default",
  "system_fingerprint": null
}


Tool call: llm_time({})
  {
    "utc_time": "2025-11-03 16:46:36 UTC",
    "utc_time_iso": "2025-11-03T16:46:36.613879+00:00",
    "local_timezone": "EST",
    "local_time": "2025-11-03 11:46:36",
    "timezone_offset": "UTC-5:00",
    "is_dst": false
  }

Request: POST https://api.openai.com/v1/chat/completions
  Headers:
    host: api.openai.com
    connection: keep-alive
    accept: application/json
    content-type: application/json
    user-agent: OpenAI/Python 2.6.1
    x-stainless-lang: python
    x-stainless-package-version: 2.6.1
    x-stainless-os: MacOS
    x-stainless-arch: arm64
    x-stainless-runtime: CPython
    x-stainless-runtime-version: 3.13.5
    authorization: [...]
    x-stainless-async: false
    x-stainless-retry-count: 0
    x-stainless-read-timeout: 600
    content-length: 793
  Body:
    {
      "messages": [
        {
          "role": "user",
          "content": "Call the llm_time tool to calculate the time. Only respond with the tool call."
        },
        {
          "role": "assistant",
          "tool_calls": [
            {
              "type": "function",
              "id": "call_jz4QuW0Dtrqvqbbre3uYEKe2",
              "function": {
                "name": "llm_time",
                "arguments": "{}"
              }
            }
          ]
        },
        {
          "role": "tool",
          "tool_call_id": "call_jz4QuW0Dtrqvqbbre3uYEKe2",
          "content": "{\"utc_time\": \"2025-11-03 16:46:36 UTC\", \"utc_time_iso\": \"2025-11-03T16:46:36.613879+00:00\", \"local_timezone\": \"EST\", \"local_time\": \"2025-11-03 11:46:36\", \"timezone_offset\": \"UTC-5:00\", \"is_dst\": false}"
        }
      ],
      "model": "gpt-5",
      "reasoning_effort": "minimal",
      "stream": false,
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "llm_time",
            "description": "Returns the current time, as local time and UTC",
            "parameters": {
              "properties": {},
              "type": "object"
            }
          }
        }
      ]
    }
Response: status_code=200
  Headers:
    date: Mon, 03 Nov 2025 16:46:39 GMT
    content-type: application/json
    content-length: 829
    connection: keep-alive
    access-control-expose-headers: X-Request-ID
    openai-organization: ankihub-lhcs7y
    openai-processing-ms: 2264
    openai-project: proj_Mg8izjsUoAvN6d6P4UOO1cr3
    openai-version: 2020-10-01
    x-envoy-upstream-service-time: 2293
    x-ratelimit-limit-requests: 15000
    x-ratelimit-limit-tokens: 40000000
    x-ratelimit-remaining-requests: 14999
    x-ratelimit-remaining-tokens: 39999926
    x-ratelimit-reset-requests: 4ms
    x-ratelimit-reset-tokens: 0s
    x-request-id: req_22384fc8819f481c86054335419af686
    x-openai-proxy-wasm: v0.1
    cf-cache-status: DYNAMIC
    set-cookie: __cf_bm=...
    strict-transport-security: max-age=31536000; includeSubDomains; preload
    x-content-type-options: nosniff
    server: cloudflare
    cf-ray: 998d59472efa43ff-EWR
    alt-svc: h3=":443"; ma=86400
  Body:
{
  "id": "chatcmpl-CXrzQKaBIhGXj3lAQore0CCZUNORr",
  "object": "chat.completion",
  "created": 1762188396,
  "model": "gpt-5-2025-08-07",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\"recipient_name\":\"functions.llm_time\",\"parameters\":{}}",
        "refusal": null,
        "annotations": []
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 255,
    "completion_tokens": 16,
    "total_tokens": 271,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "service_tier": "default",
  "system_fingerprint": null
}

{"recipient_name":"functions.llm_time","parameters":{}}


llm on  fix-1222-tools-with-schemas-bug [!?] using ☁️  default/ankihub-ai took 5.6s
➜ LLM_OPENAI_SHOW_RESPONSES=1 uv run llm -c --no-stream --schema "date: current date"
Request: POST https://api.openai.com/v1/chat/completions
  Headers:
    host: api.openai.com
    connection: keep-alive
    accept: application/json
    content-type: application/json
    user-agent: OpenAI/Python 2.6.1
    x-stainless-lang: python
    x-stainless-package-version: 2.6.1
    x-stainless-os: MacOS
    x-stainless-arch: arm64
    x-stainless-runtime: CPython
    x-stainless-runtime-version: 3.13.5
    authorization: [...]
    x-stainless-async: false
    x-stainless-retry-count: 0
    x-stainless-read-timeout: 600
    content-length: 1077
  Body:
    {
      "messages": [
        {
          "role": "user",
          "content": "Call the llm_time tool to calculate the time. Only respond with the tool call."
        },
        {
          "role": "assistant",
          "tool_calls": [
            {
              "type": "function",
              "id": "call_jz4QuW0Dtrqvqbbre3uYEKe2",
              "function": {
                "name": "llm_time",
                "arguments": "{}"
              }
            }
          ]
        },
        {
          "role": "tool",
          "tool_call_id": "call_jz4QuW0Dtrqvqbbre3uYEKe2",
          "content": "{\"utc_time\": \"2025-11-03 16:46:36 UTC\", \"utc_time_iso\": \"2025-11-03T16:46:36.613879+00:00\", \"local_timezone\": \"EST\", \"local_time\": \"2025-11-03 11:46:36\", \"timezone_offset\": \"UTC-5:00\", \"is_dst\": false}"
        },
        {
          "role": "assistant",
          "content": "{\"recipient_name\":\"functions.llm_time\",\"parameters\":{}}"
        }
      ],
      "model": "gpt-5",
      "reasoning_effort": "minimal",
      "response_format": {
        "type": "json_schema",
        "json_schema": {
          "name": "output",
          "schema": {
            "type": "object",
            "properties": {
              "date": {
                "type": "string",
                "description": "current date"
              }
            },
            "required": [
              "date"
            ]
          }
        }
      },
      "stream": false,
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "llm_time",
            "description": "Returns the current time, as local time and UTC",
            "parameters": {
              "properties": {},
              "type": "object"
            }
          }
        }
      ]
    }
Response: status_code=200
  Headers:
    date: Mon, 03 Nov 2025 16:46:55 GMT
    content-type: application/json
    content-length: 793
    connection: keep-alive
    access-control-expose-headers: X-Request-ID
    openai-organization: ankihub-lhcs7y
    openai-processing-ms: 1944
    openai-project: proj_Mg8izjsUoAvN6d6P4UOO1cr3
    openai-version: 2020-10-01
    x-envoy-upstream-service-time: 1959
    x-ratelimit-limit-requests: 15000
    x-ratelimit-limit-tokens: 40000000
    x-ratelimit-remaining-requests: 14999
    x-ratelimit-remaining-tokens: 39999910
    x-ratelimit-reset-requests: 4ms
    x-ratelimit-reset-tokens: 0s
    x-request-id: req_581e30404a104daca01010a8861939d2
    x-openai-proxy-wasm: v0.1
    cf-cache-status: DYNAMIC
    set-cookie: __cf_bm=...
    strict-transport-security: max-age=31536000; includeSubDomains; preload
    x-content-type-options: nosniff
    server: cloudflare
    cf-ray: 998d59af1da40c23-EWR
    alt-svc: h3=":443"; ma=86400
  Body:
{
  "id": "chatcmpl-CXrzhVveKh0dJAwSm6Foc8fzG4wlW",
  "object": "chat.completion",
  "created": 1762188413,
  "model": "gpt-5-2025-08-07",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\"date\":\"2025-11-03\"}",
        "refusal": null,
        "annotations": []
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 303,
    "completion_tokens": 22,
    "total_tokens": 325,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "service_tier": "default",
  "system_fingerprint": null
}

{"date":"2025-11-03"}

As you can see, the final result is nearly the same: both runs end with a JSON object containing a date field. However, if we include the schema in the first call, we are trusting the LLM to make the right decision, namely to ignore the schema and simply respond with a tool call.
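On the question of how to test this, one possible starting point is to assert that the final output conforms to the schema while intermediate outputs need not. The sketch below uses a deliberately small hand-written check (`conforms` is a hypothetical helper, not a full JSON Schema validator) and the three output strings are copied from the logs above.

```python
import json
from typing import Any

SCHEMA = {
    "type": "object",
    "properties": {"date": {"type": "string", "description": "current date"}},
    "required": ["date"],
}


def conforms(text: str, schema: dict[str, Any]) -> bool:
    """Check required keys and string-typed properties only."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    if any(key not in data for key in schema.get("required", [])):
        return False
    for key, spec in schema.get("properties", {}).items():
        if key in data and spec.get("type") == "string" and not isinstance(data[key], str):
            return False
    return True


# Final output when the schema was sent with every request.
schema_always = '{"date":"Local time: 2025-11-17 11:33:46 (EST, UTC-5). UTC time: 2025-11-17 16:33:46."}'
# Final output when the schema was only sent in the follow-up request.
schema_deferred = '{"date":"2025-11-03"}'
# Intermediate output produced before any schema was sent.
intermediate = '{"recipient_name":"functions.llm_time","parameters":{}}'

print(conforms(schema_always, SCHEMA), conforms(schema_deferred, SCHEMA), conforms(intermediate, SCHEMA))
```

Both final outputs pass this structural check, although the date value in the first run also carries time-of-day text; the intermediate output fails because it has no date key.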


Development

Successfully merging this pull request may close these issues.

Using tools breaks schema