Conversation

jxy (Contributor) commented Nov 20, 2025

Commit cfcc87a changed how response items are batched and recorded, which caused tool call objects to be recorded before the assistant message content in some turns.

This could result in a { role: "assistant", content: null, tool_calls: [...] } message being sent before the actual assistant message with text content, breaking ordering expectations downstream.

This change reorders the items collected for a single turn so that the last non-empty assistant message is placed immediately before the first tool call object when both occur in the same turn, without perturbing prior history.

A new unit test ensures that recorded items now appear in the expected order: assistant text, then tool call object, then the corresponding tool output input item.
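
To illustrate, here is a minimal sketch of the reordering, with a hypothetical simplified item type (the actual codex-rs types differ):

// Hypothetical simplified item type standing in for the real response items.
#[derive(Debug, PartialEq)]
enum Item {
    AssistantText(String),
    ToolCall(String),   // call id issued
    ToolOutput(String), // call id being answered
}

// Move the last non-empty assistant message immediately before the first
// tool call, leaving every other item (and prior history) untouched.
fn reorder_turn(items: &mut Vec<Item>) {
    let first_call = items.iter().position(|i| matches!(i, Item::ToolCall(_)));
    let last_text = items
        .iter()
        .rposition(|i| matches!(i, Item::AssistantText(t) if !t.is_empty()));
    if let (Some(call), Some(text)) = (first_call, last_text) {
        if text > call {
            let msg = items.remove(text);
            items.insert(call, msg);
        }
    }
}

Only the in-turn position of the assistant text changes; tool outputs and earlier history keep their positions.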

etraut-openai (Collaborator) commented

Thanks for the contribution.

Is there an open bug report associated with this PR? We ask that all bug fix PRs have a linked bug report (see contribution guidelines).

It's not clear to me what problem this PR is solving. You mentioned that the recent change is "breaking ordering expectations downstream". Whose expectations, and how does that manifest?

etraut-openai added the needs-response label (additional information is requested) on Nov 20, 2025

jxy (Contributor, Author) commented Nov 21, 2025

This only affects the chat/completions endpoint with OpenAI-compatible API servers.

With the current codex:

codex-cli 0.61.0

Test with the following. Run a test server (needs nc and jq):

# Turn 1: serve one canned SSE response containing a tool call, then print
# the JSON body of the request we received (sed drops the HTTP request line
# and headers; jq skips the two system/context messages).
{
cat <<'EOF'
HTTP/1.1 200 OK
Content-Type: text/event-stream

data: {"id":"chatcmpl-testid","object":"chat.completion.chunk","created":1763690000,"model":"gpt-test","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-testid","object":"chat.completion.chunk","created":1763690000,"model":"gpt-test","choices":[{"index":0,"delta":{"content":"ASSISTANT MESSAGE HERE"},"finish_reason":null}]}

data: {"id":"chatcmpl-testid","object":"chat.completion.chunk","created":1763690000,"model":"gpt-test","choices":[{"index":0,"delta":{"content":null,"tool_calls":[{"index":0,"id":"call_testid","type":"function","function":{"name":"shell","arguments":""}}]},"finish_reason":null}]}

data: {"id":"chatcmpl-testid","object":"chat.completion.chunk","created":1763690000,"model":"gpt-test","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\"command\":[\"bash\",\"-lc\",\"uname\"],\"timeout_ms\":120000,\"workdir\":\"/tmp\"}"}}]},"finish_reason":null}]}

data: {"id":"chatcmpl-testid","object":"chat.completion.chunk","created":1763690000,"model":"gpt-test","choices":[{"index":0,"delta":{},"finish_reason":"tool_calls"}]}

data: [DONE]
EOF
} | nc -l 8888 | sed '/^[^{]/d' | jq '.messages[2:]'
# Turn 2: serve a plain text completion and print the follow-up request,
# which carries the recorded history from turn 1.
{
cat <<'EOF'
HTTP/1.1 200 OK
Content-Type: text/event-stream

data: {"id":"chatcmpl-testid","object":"chat.completion.chunk","created":1763690000,"model":"gpt-test","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-testid","object":"chat.completion.chunk","created":1763690000,"model":"gpt-test","choices":[{"index":0,"delta":{"content":"CHECK WHAT I RECEIVED"},"finish_reason":null}]}

data: {"id":"chatcmpl-testid","object":"chat.completion.chunk","created":1763690000,"model":"gpt-test","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
EOF
} | nc -l 8888 | sed '/^[^{]/d' | jq '.messages[2:]'

From another terminal, connect with codex:

codex \
-c 'profiles.test.model_provider="test"' \
-c 'model_providers.test.name="test"' \
-c 'model_providers.test.base_url="http://localhost:8888/v1"' \
-p test --full-auto \
'A quick test.'

Our nc server will print the following:

[
  {
    "role": "user",
    "content": "A quick test."
  }
]
[
  {
    "role": "user",
    "content": "A quick test."
  },
  {
    "role": "assistant",
    "content": null,
    "tool_calls": [
      {
        "id": "call_testid",
        "type": "function",
        "function": {
          "name": "shell",
          "arguments": "{\"command\":[\"bash\",\"-lc\",\"uname\"],\"timeout_ms\":120000,\"workdir\":\"/tmp\"}"
        }
      }
    ]
  },
  {
    "role": "assistant",
    "content": "ASSISTANT MESSAGE HERE"
  },
  {
    "role": "tool",
    "tool_call_id": "call_testid",
    "content": "Exit code: 0\nWall time: 0 seconds\nOutput:\nDarwin\n"
  }
]

You can see that "ASSISTANT MESSAGE HERE" was sent after the tool_calls request. Previous versions had the assistant content come before tool_calls, which should be the correct order. Bisecting the commits led me to cfcc87a, which intended to move tool_calls before the results of the tool call, yet neglected the case where the assistant sends non-empty content, which then got placed after tool_calls.

This commit moves the assistant message, if there is one, before the tool_calls, and the new unit test asserts that ordering. With this change, you should see the correct order:

[
  {
    "role": "user",
    "content": "A quick test."
  }
]
[
  {
    "role": "user",
    "content": "A quick test."
  },
  {
    "role": "assistant",
    "content": "ASSISTANT MESSAGE HERE"
  },
  {
    "role": "assistant",
    "content": null,
    "tool_calls": [
      {
        "id": "call_testid",
        "type": "function",
        "function": {
          "name": "shell",
          "arguments": "{\"command\":[\"bash\",\"-lc\",\"uname\"],\"timeout_ms\":120000,\"workdir\":\"/tmp\"}"
        }
      }
    ]
  },
  {
    "role": "tool",
    "tool_call_id": "call_testid",
    "content": "Exit code: 0\nWall time: 0 seconds\nOutput:\nDarwin\n"
  }
]
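
For reference, a minimal sketch of the kind of assertion the new unit test makes, reusing the hypothetical Item type and reorder_turn from the sketch above (the real test uses the actual codex-rs item types):

#[test]
fn assistant_text_recorded_before_tool_call() {
    // Mirrors the buggy order observed above: tool call first, text second.
    let mut items = vec![
        Item::ToolCall("call_testid".into()),
        Item::AssistantText("ASSISTANT MESSAGE HERE".into()),
        Item::ToolOutput("call_testid".into()),
    ];
    reorder_turn(&mut items);
    // Expected order: assistant text, tool call, tool output.
    assert!(matches!(&items[0], Item::AssistantText(_)));
    assert!(matches!(&items[1], Item::ToolCall(_)));
    assert!(matches!(&items[2], Item::ToolOutput(_)));
}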

jxy (Contributor, Author) commented Nov 21, 2025

And I just saw a bug report.

This PR should fix #7051

211211 commented Nov 21, 2025

codex --version
codex-cli 0.61.0

I'm using codex 0.61.0 installed via brew cask locally, and I always hit the error below. Is there a way to switch back to a previous cask version? I tried, but couldn't install one with brew install --cask.
{ "error": { "message": "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: call_2hWYykIE9bKkFT6AFNOPseAN", "type": "invalid_request_error", "param": "messages.[5].role", "code": null } }

448523760 (Contributor) commented

I believe this issue is model-dependent. When the messages are arranged in the sequence [assistant (tool_calls), assistant (content), tool (content)], some models return an error while others do not. For example, when I used glm-4.6 and sent requests in that order, no error occurred. When I submitted the same request to deepseek-chat, it returned an HTTP 400 response with the following error:

ERROR: unexpected status 400 Bad Request: {"error":{"message":"An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. (insufficient tool messages following tool_calls message)","type":"invalid_request_error","param":null,"code":"invalid_request_error"}}

But it does indeed affect the GPT-series models.
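
For illustration, a sketch of the invariant that strict providers such as deepseek-chat appear to enforce, using a hypothetical simplified message shape rather than any real API type:

// Hypothetical simplified message shape mirroring the chat/completions JSON.
struct Msg {
    role: &'static str,
    call_ids: Vec<&'static str>, // ids issued (assistant) or answered (tool)
}

// An assistant message carrying tool_calls must be immediately followed by
// one tool message per tool_call_id; strict providers reject anything else.
fn tool_calls_answered(msgs: &[Msg]) -> bool {
    let mut i = 0;
    while i < msgs.len() {
        if msgs[i].role == "assistant" && !msgs[i].call_ids.is_empty() {
            for id in &msgs[i].call_ids {
                i += 1;
                if i >= msgs.len()
                    || msgs[i].role != "tool"
                    || !msgs[i].call_ids.contains(id)
                {
                    return false;
                }
            }
        }
        i += 1;
    }
    true
}

On the failing sequence above (assistant tool_calls, assistant content, tool), this check fails at the interleaved assistant content message, which matches what the 400 response reports.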

Below is the request log I captured from Codex (version: 0.63.0):

{
    "model": "glm-4.6",
    "messages": [
        {
            "role": "system",
            "content": "You are a coding agent running in the Codex CLI, a terminal-based coding assistant. Codex CLI is....."
        },
        {
            "role": "user",
            "content": "<environment_context>...</environment_context>"
        },
        {
            "role": "user",
            "content": "write foo into bar file and save in the current location"
        },
        {
            "role": "assistant",
            "content": null,
            "tool_calls": [
                {
                    "id": "call_c64a59b760fa422e8e603a2f",
                    "type": "function",
                    "function": {
                        "name": "exec_command",
                        "arguments": "{\"cmd\":\"echo \\\"foo\\\" > bar\"}"
                    }
                }
            ]
        },
        {
            "role": "assistant",
            "content": "\nI'll create a file named \"bar\" with the content \"foo\" in the current location.\n"
        },
        {
            "role": "tool",
            "tool_call_id": "call_c64a59b760fa422e8e603a2f",
            "content": "Chunk ID: 987afe\nWall time: 0.0261 seconds\nProcess exited with code 0\nOriginal token count: 0\nOutput:\n"
        }
    ],
    "stream": true,
    "tools": [...]
}

And Codex can finish the above task only with GLM.

Below is my model config:

[model_providers.glm_4_6]
name = "glm_4_6"
base_url = "https://open.bigmodel.cn/api/coding/paas/v4"
env_key = "GLM_API_KEY"
wire_api = "chat"
query_params = {}

[model_providers.deepseek-chat]
name = "DeepSeek"
base_url = "https://api.deepseek.com"
env_key = "DEEPSEEK_API_KEY"
wire_api = "chat"
query_params = {}
