Skip to content

bug(antigravity/claude): Thinking blocks returned as raw <thinking> XML tags instead of proper thinking content blocks #1777

@edxeth

Description

@edxeth

Is it a request payload issue?
[ ] Yes, this is a request payload issue. I am using a client/cURL to send a request payload, but I received an unexpected error.
[x] No, it's another issue.

Describe the bug

When using Claude models (Opus 4.6, Sonnet 4.6) through the Antigravity provider, thinking content is returned as raw text with <thinking> XML tags mixed into type: "text" content blocks, instead of proper type: "thinking" content blocks. Clients like OpenCode that render thinking blocks separately (dark grey, visually separated) cannot distinguish thinking from response text.

This works correctly for Gemini models on the same Antigravity backend (where the backend sets thought: true on thought parts), and also works correctly with providers that return native Anthropic format (e.g., Z.ai for GLM-5).

Root cause: The Antigravity backend for Claude models does not annotate thinking with thought: true in the response parts. Instead, Claude's thinking arrives as regular text wrapped in <thinking>...</thinking> XML tags. Since CLIProxyAPI's response translator relies exclusively on the thought: true flag, the thinking content passes through as regular text.

Image

CLI Type

Antigravity (Google backend for Claude models)

Model Name

claude-opus-4-6-thinking, claude-sonnet-4-6-thinking (via Antigravity)

LLM Client

OpenCode 1.2.15 with @ai-sdk/anthropic SDK

Request Information

# Streaming request with thinking enabled
curl -s http://localhost:8317/v1/messages \
  -H "x-api-key: <key>" -H "content-type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{"model":"claude-opus-4-6-thinking","max_tokens":500,"stream":true,"thinking":{"type":"enabled","budget_tokens":8192},"messages":[{"role":"user","content":"Solve: what is the square root of 252557324527? Show your work."}]}'

Streaming response shows only content_block_start with "type":"text" and text_delta events — zero thinking blocks or thinking_delta events, even when the model produces thinking content.

When thinking IS produced by the backend, it appears as raw <thinking>...</thinking> XML tags within the text content block, e.g.:

502,550² = 252,556,302,500
252,557,324,527 - 252,556,302,500 = 1,022,027
</thinking>

√252,557,324,527 ≈ 502,550.818

<thinking>
The square root is approximately 502,550.818.
</thinking>

Expected behavior

Thinking content should arrive as separate content blocks:

{
  "content": [
    { "type": "thinking", "thinking": "Let me calculate...", "signature": "..." },
    { "type": "text", "text": "√252,557,324,527 ≈ 502,550.818" }
  ]
}

This is what happens with native Anthropic format providers (e.g., Z.ai with GLM-5), where OpenCode renders thinking in dark grey, visually separated from the response.

Code Analysis

The response translator correctly handles thought: true when present — the issue is at the backend layer:

Response translator (internal/translator/antigravity/claude/antigravity_claude_response.go, line 136):

if partResult.Get("thought").Bool() {
    // Creates proper content_block_start with type:"thinking"
    // Creates thinking_delta events
    // ← This code path works correctly (verified with Gemini models)
} else {
    // Regular text content → type:"text" blocks
    // ← Claude thinking lands here because backend doesn't set thought: true
}

Executor (internal/runtime/executor/antigravity_executor.go, lines 652-664):

thought := part.Get("thought").Bool()
if thought || part.Get("text").Exists() {
    kind := "text"
    if thought {
        kind = "thought"
    }
    // Claude via Antigravity: thought is always false, kind is always "text"
}

Data flow comparison:

Provider Backend behavior Result
Antigravity/Claude Thinking embedded as <thinking> XML tags in regular text, no thought: true flag BROKEN: Raw XML tags in text blocks
Antigravity/Gemini Thinking annotated with thought: true OK: Proper thinking blocks
Z.ai/GLM-5 Native Anthropic format with type: "thinking" content blocks OK: Proper thinking blocks

Suggested Fix

Since the Antigravity backend doesn't annotate Claude's thinking with thought: true, CLIProxyAPI could work around this by detecting <thinking> XML tags in text content during response translation:

  1. In antigravity_claude_response.go, when processing text parts without thought: true, check for <thinking>...</thinking> patterns
  2. Split the text into segments and emit thinking segments as type: "thinking" content blocks
  3. Gate this behavior on Claude models only (check model name) to avoid false positives
  4. Handle streaming edge cases where <thinking> open/close tags may span multiple chunks

Alternatively, if there's a way to configure the Antigravity backend to return thought: true for Claude models (different API parameter or flag), that would be cleaner since the existing handler would work as-is.

OS Type

  • OS: Linux (WSL2)
  • Kernel: 6.6.87.2-microsoft-standard-WSL2

Additional context

Metadata

Metadata

Assignees

No one assigned

    Labels

    pendingWaiting for research

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions