refactor: Move Anthropic translation layer to library #47

mirrobot-agent · 2025-12-19T19:42:46Z

Summary

This PR refactors the Anthropic endpoint support from PR #45 by moving the translation layer into the rotator_library as a proper, reusable module.

Related to: #45

Changes

New Library Module: `rotator_library/anthropic_compat/`

models.py: Pydantic models for Anthropic API (requests, responses, content blocks)
translator.py: Format translation functions between Anthropic and OpenAI formats
streaming.py: Framework-agnostic streaming wrapper that converts OpenAI SSE to Anthropic SSE
__init__.py: Public exports

Updated `rotator_library/client.py`

Added two new methods to RotatingClient:

anthropic_messages() - Handle Anthropic Messages API requests
anthropic_count_tokens() - Handle token counting

Simplified `proxy_app/main.py`

Removed ~663 lines of local Anthropic code
Now imports models and functions from rotator_library.anthropic_compat
Endpoints work the same way but use library components
Fixed verify_anthropic_api_key to support open access mode

Benefits

Reusability: The Anthropic translation layer can now be used by other applications using rotator_library
Maintainability: Clear separation between library code and application code
Testability: Library components can be unit tested independently
Consistency: Follows the existing library architecture patterns

Files Changed

src/rotator_library/
├── anthropic_compat/
│   ├── __init__.py          (NEW)
│   ├── models.py            (NEW)
│   ├── translator.py        (NEW)
│   └── streaming.py         (NEW)
├── client.py                (MODIFIED)
└── __init__.py              (MODIFIED)

src/proxy_app/
└── main.py                  (MODIFIED)

Important

Refactor Anthropic translation layer into a reusable library module for improved code reusability and maintainability.

New Library Module: rotator_library/anthropic_compat/
- models.py: Pydantic models for Anthropic API requests and responses.
- translator.py: Functions for translating between Anthropic and OpenAI formats.
- streaming.py: Converts OpenAI SSE to Anthropic SSE.
Client Updates in client.py:
- Added anthropic_messages() and anthropic_count_tokens() methods to RotatingClient.
Main Application Simplification in main.py:
- Removed ~663 lines of Anthropic-specific code.
- Now uses rotator_library.anthropic_compat for Anthropic functionality.
- Updated verify_anthropic_api_key() to support open access mode.

^{This description was created by}^{for 9d30ea6. You can customize this summary. It will automatically update as commits are pushed.}

…atibility - Add /v1/messages endpoint with Anthropic-format request/response - Support both x-api-key and Bearer token authentication - Implement Anthropic <-> OpenAI format translation for messages, tools, and responses - Add streaming wrapper converting OpenAI SSE to Anthropic SSE events - Handle tool_use blocks with proper stop_reason detection - Fix NoneType iteration bug in tool_calls handling

- Add AnthropicThinkingConfig model and thinking parameter to request - Translate Anthropic thinking config to reasoning_effort for providers - Handle reasoning_content in streaming wrapper (thinking_delta events) - Convert reasoning_content to thinking blocks in non-streaming responses

When no thinking config is provided in the request, Opus models now automatically use reasoning_effort=high with custom_reasoning_budget=True. This ensures Opus 4.5 uses the full 32768 token thinking budget instead of the backend's auto mode (thinkingBudget: -1) which may use less. Opus always uses the -thinking variant regardless, but this change guarantees maximum thinking capacity for better reasoning quality.

…ling - Add validation to ensure maxOutputTokens > thinkingBudget for Claude extended thinking (prevents 400 INVALID_ARGUMENT API errors) - Improve streaming error handling to send proper message_start and content blocks before error event for better client compatibility - Minor code formatting improvements

Track each tool_use block index separately and emit content_block_stop for all blocks (thinking, text, and each tool_use) when stream ends. Fixes Claude Code stopping mid-action due to malformed streaming events.

…nabled - Fixed bug where budget_tokens between 10000-32000 would get ÷4 reduction - Now any explicit thinking request sets custom_reasoning_budget=True - Added logging to show thinking budget, effort level, and custom_budget flag - Simplified budget tier logic (removed redundant >= 32000 check) Before: 31999 tokens requested → 8192 tokens actual (÷4 applied) After: 31999 tokens requested → 32768 tokens actual (full "high" budget)

When using /v1/chat/completions with Opus and reasoning_effort="high" or "medium", automatically set custom_reasoning_budget=true to get full thinking tokens instead of the ÷4 reduced default. This makes the OpenAI endpoint behave consistently with the Anthropic endpoint for Opus models - if you're using Opus with high reasoning, you want the full thinking budget. Adds logging: "🧠 Thinking: auto-enabled custom_reasoning_budget for Opus"

…ponses

…treaming Claude Code and other Anthropic SDK clients require message_start to be sent before any other SSE events. When a stream completed quickly without content chunks, the wrapper would send message_stop without message_start, causing clients to silently discard all output.

Signed-off-by: Moeeze Hassan <[email protected]>

This reverts commit e80645e.

…ing is enabled" This reverts commit 2ee549d.

Extract Anthropic API models and format translation functions from main.py into reusable library module: - models.py: Pydantic models for Anthropic Messages API (request/response) - translator.py: Functions to convert between Anthropic and OpenAI formats - anthropic_to_openai_messages() - anthropic_to_openai_tools() - anthropic_to_openai_tool_choice() - openai_to_anthropic_response() - translate_anthropic_request() - High-level request translation This is part of the refactoring to make Anthropic compatibility a proper library feature.

Add framework-agnostic streaming wrapper for Anthropic format: - streaming.py: Converts OpenAI SSE format to Anthropic SSE format - Handles message_start, content_block_start/delta/stop, message_delta, message_stop - Supports text, thinking, and tool_use content blocks - Uses callback-based disconnect detection instead of FastAPI Request - Proper error handling with client-visible error blocks - __init__.py: Export all models, translator functions, and streaming wrapper The streaming wrapper is now reusable outside of FastAPI.

…ethods Add high-level Anthropic API methods to RotatingClient: - anthropic_messages(): Handle Anthropic Messages API requests - Accepts AnthropicMessagesRequest, translates to OpenAI format - Routes through existing acompletion() with full retry/rotation logic - Returns response in Anthropic format (streaming or non-streaming) - anthropic_count_tokens(): Handle token counting for Anthropic requests - Counts tokens for messages and tools - Returns count in Anthropic format These methods enable any application using rotator_library to support Anthropic clients without needing to implement format translation.

…rary Add lazy-loaded exports for Anthropic API models: - AnthropicMessagesRequest - AnthropicMessagesResponse - AnthropicCountTokensRequest - AnthropicCountTokensResponse These can now be imported directly from rotator_library: from rotator_library import AnthropicMessagesRequest

Refactor main.py to use the anthropic_compat library module: - Import models from rotator_library.anthropic_compat instead of local definitions - Import translation functions from the library - Remove ~130 lines of local Anthropic model definitions - Keep endpoints functional by importing needed functions The endpoints still work the same way, but now use the library's reusable components. This is a stepping stone - the full client methods (anthropic_messages, anthropic_count_tokens) can be used for even simpler endpoints in the future. Net reduction: ~130 lines removed from main.py

mirrobot-agent · 2025-12-19T19:43:28Z

Time to examine my own handiwork! Let's see what past-me was thinking when moving 1,300+ lines of Anthropic compatibility code into the library. This should be interesting... 🔍

mirrobot-agent

Self-Review Assessment

Well, well, well... time to grade my own homework! 📝

Overall, this is a solid refactoring that moves the Anthropic compatibility layer from proxy_app into rotator_library as a reusable module. The architecture follows the existing library patterns, and the separation of concerns is clear. Past-me did a decent job here... mostly.

Architectural Reflections

The module structure (models.py, translator.py, streaming.py) is clean and intuitive. I particularly like that:

The streaming wrapper is framework-agnostic (accepts an optional disconnect callback rather than a FastAPI Request)
The lazy-loading in __init__.py maintains fast import times
The new RotatingClient methods provide a clean interface for Anthropic requests

Key Fixes I Should Make

Code duplication in main.py: I created a nice translate_anthropic_request() function but then... didn't use it in the endpoint? Classic past-me.
Type validation in models: The AnthropicThinkingConfig.type field accepts any string. Should use Literal["enabled", "disabled"] for proper validation.
Silent JSON failures: When parsing tool call arguments fails, I silently return an empty dict. Adding a debug log would make troubleshooting much easier.

Things That Actually Look Good

The antigravity_provider.py addition for validating max_tokens > thinkingBudget is a good defensive check
Error handling in the streaming wrapper properly ensures message_start is sent before any error events
The token counting endpoint correctly handles both messages and tool definitions

Overall, this PR is ready to merge once the minor improvements above are addressed. Future-me thanks past-me for the decent documentation and clean module structure!

This self-review was generated by an AI assistant.

mirrobot-agent · 2025-12-19T19:45:36Z

src/proxy_app/main.py

+        # Convert Anthropic request to OpenAI format
+        anthropic_request = body.model_dump(exclude_none=True)
+
+        openai_messages = anthropic_to_openai_messages(
+            anthropic_request.get("messages", []), anthropic_request.get("system")
+        )
+
+        openai_tools = anthropic_to_openai_tools(anthropic_request.get("tools"))
+        openai_tool_choice = anthropic_to_openai_tool_choice(
+            anthropic_request.get("tool_choice")
+        )
+
+        # Build OpenAI-compatible request
+        openai_request = {
+            "model": body.model,
+            "messages": openai_messages,
+            "max_tokens": body.max_tokens,
+            "stream": body.stream or False,
+        }
+
+        if body.temperature is not None:
+            openai_request["temperature"] = body.temperature
+        if body.top_p is not None:
+            openai_request["top_p"] = body.top_p
+        if body.stop_sequences:
+            openai_request["stop"] = body.stop_sequences
+        if openai_tools:
+            openai_request["tools"] = openai_tools
+        if openai_tool_choice:
+            openai_request["tool_choice"] = openai_tool_choice
+
+        # Handle Anthropic thinking config -> reasoning_effort translation
+        if body.thinking:
+            if body.thinking.type == "enabled":
+                # Map budget_tokens to reasoning_effort level
+                # Default to "medium" if enabled but budget not specified
+                budget = body.thinking.budget_tokens or 10000
+                if budget >= 32000:
+                    openai_request["reasoning_effort"] = "high"
+                    openai_request["custom_reasoning_budget"] = True
+                elif budget >= 10000:
+                    openai_request["reasoning_effort"] = "high"
+                elif budget >= 5000:
+                    openai_request["reasoning_effort"] = "medium"
+                else:
+                    openai_request["reasoning_effort"] = "low"
+            elif body.thinking.type == "disabled":
+                openai_request["reasoning_effort"] = "disable"
+        elif "opus" in body.model.lower():
+            # Force high thinking for Opus models when no thinking config is provided
+            # Opus 4.5 always uses the -thinking variant, so we want maximum thinking budget
+            # Without this, the backend defaults to thinkingBudget: -1 (auto) instead of high
+            openai_request["reasoning_effort"] = "high"
+            openai_request["custom_reasoning_budget"] = True


Ah, it seems past-me got a bit ahead of himself! I specifically created translate_anthropic_request() in the library to handle this translation, but then proceeded to duplicate the logic here manually.

The thinking budget mapping logic (lines 1067-1089) is nearly identical to what I put in translator.py (lines 283-304). This could lead to drift if one is updated without the other.

I should either:

Use translate_anthropic_request() directly, or

Document why the endpoint needs different handling than the library method

mirrobot-agent · 2025-12-19T19:45:36Z

src/rotator_library/anthropic_compat/models.py

+class AnthropicThinkingConfig(BaseModel):
+    """Anthropic thinking configuration."""
+
+    type: str  # "enabled" or "disabled"
+    budget_tokens: Optional[int] = None


Past-me was feeling a bit lazy here. The type field should be constrained to valid values ("enabled" or "disabled"). Using a plain str allows invalid values to slip through validation.

Consider using:

from typing import Literal # ... type: Literal["enabled", "disabled"]

This gives proper IDE autocomplete and catches typos at validation time.

mirrobot-agent · 2025-12-19T19:45:36Z

src/rotator_library/anthropic_compat/translator.py

+        try:
+            input_data = json.loads(func.get("arguments", "{}"))
+        except json.JSONDecodeError:
+            input_data = {}


My past-self used a silent fallback here - while pragmatic, it could mask issues where tool arguments are malformed. When tools mysteriously receive empty inputs, this will be hard to debug.

A quick logging statement would help future-me:

except json.JSONDecodeError as e: logging.debug(f"Failed to parse tool arguments: {e}") input_data = {}

b3nw · 2026-01-09T04:44:48Z

BUG Report: Anthropic Compatibility & Claude Code Integration

Executive Summary

This document tracks critical bugs identified and resolved during the testing of PR #47 (Anthropic Compatibility) when used with the Claude Code v2.1.2 CLI client.

Resolved Issues:

Antigravity 400 Error (Tool Schema): Prevented crashes during tool-use by stripping unsupported enumDescriptions from JSON schemas.
Streaming Wrapper TypeError: Fixed a 500 Internal Server Error caused by a parameter mismatch in the async streaming generator.
Redundant Translation Logic: Refactored main.py to use core library methods, ensuring consistent behavior and easier maintenance.
Pydantic Validation Failures: Relaxed validation to allow "extra" fields, supporting Claude Code's newer beta features like interleaved-thinking.

1. Antigravity API 400 Error (Tool Schemas)

Issue Description

Requests failed with a 400 Bad Request error from the Antigravity backend whenever Claude Code attempted to use tools (indexing, bash commands, etc.).

Root Cause

The Antigravity API is a strict Proto-based interface. Claude Code includes an enumDescriptions field in its tool parameter schemas which is not part of the standard Google Gemini tool specification.

Error Message

"message": "Invalid JSON payload received. Unknown name \"enumDescriptions\" at 'request.tools[0]... Cannot find field."

Resolution

Modified antigravity_provider.py to recursively strip enumDescriptions from all tool definitions before sending the request to the Google backend.

2. TypeError in Anthropic Streaming Wrapper

Issue Description

Claude Code sessions triggered a server-side 500 error immediately upon starting a stream.

Root Cause

A parameter mismatch in the /v1/messages endpoint within main.py. The code was incorrectly passing the FastAPI Request object into a generator that expected an openai_stream object, causing an iteration failure.

Error Message

Error in Anthropic streaming wrapper: 'async for' requires an object with __aiter__ method, got Request
TypeError: Object of type async_generator is not JSON serializable

Resolution

Refactored the endpoints in main.py to delegate request handling to the RotatingClient.anthropic_messages() method, which handles the parameter passing and stream wrapping correctly.

3. Pydantic 422 "Unprocessable Content"

Issue Description

Requests containing newer Anthropic beta headers (like interleaved-thinking-2025-05-14) were rejected with a 422 error before reaching the provider logic.

Root Cause

The Pydantic models for AnthropicMessagesRequest were too strict and did not allow extra fields. Claude Code frequently adds experimental fields to its JSON payload.

Resolution

Updated models.py to set model_config = ConfigDict(extra="allow") for all Anthropic request objects, ensuring future-proof compatibility with evolving Anthropic SDKs.

b3nw · 2026-01-09T04:45:36Z

Issue: Hardcoded Background Model Probes in Claude Code

Description

When using Claude Code (v2.1.2+) with the LLM-API-Key-Proxy, the client performs background "probes" for specific model names that are not explicitly requested by the user in the --model flag.

These probes appear in the proxy logs as requests for models such as:

claude-haiku-4-5-20251001
gemini-claude-opus-4-5-thinking
claude-3-5-sonnet-20241022

Root Cause

These model names are hardcoded into the Claude Code binary for two primary purposes:

Capability Discovery: Checking if the endpoint supports specific beta features (e.g., interleaved-thinking, computer-use).
Task Scaling: Attempting to use a cheaper "Haiku-class" model for background indexing or file scanning tasks to optimize performance and cost.

Because these requests are generated internally by the client, they lack the antigravity/ provider prefix and use futuristic version strings that do not exist in the proxy's default mapping tables, resulting in 404 Not Found errors from the backend.

Impact

Endless Thinking: Claude Code may hang or show a "thinking" spinner indefinitely while it retries these failed background probes.
Log Noise: Proxy logs are cluttered with 400/404 errors for models that are not configured.
Session Instability: Critical background tasks like code indexing may fail to complete.

Recommended Workaround

Since this behavior is hardcoded in the client and cannot be disabled via --no-stats or environment variables, the recommended solution is to provide a "safety net" mapping in the Proxy's configuration.

Add the following to your LLM-API-Key-Proxy/.env file to funnel these probes into the models you actually have credentials for:

# Map prefix-less background probes to the Antigravity provider
ANTIGRAVITY_MODELS='{
  "claude-haiku-4-5-20251001": {"id": "claude-sonnet-4-5"},
  "gemini-claude-opus-4-5-thinking": {"id": "claude-opus-4-5"},
  "claude-3-5-sonnet-20241022": {"id": "claude-sonnet-4-5"}
}'

Status

Client: Claude Code v2.1.2+
Proxy Version: PR refactor: Move Anthropic translation layer to library #47 (Anthropic Compatibility)

FammasMaz and others added 18 commits December 19, 2025 15:03

fix(anthropic): properly close all content blocks in streaming wrapper

bea58d4

Track each tool_use block index separately and emit content_block_stop for all blocks (thinking, text, and each tool_use) when stream ends. Fixes Claude Code stopping mid-action due to malformed streaming events.

fix(anthropic): add missing uuid import for /v1/messages endpoint

a615953

fix(anthropic): add missing JSONResponse import for non-streaming res…

32c3f21

…ponses

feat: add /context endpoint for anthropic routes

1227d8c

Signed-off-by: Moeeze Hassan <[email protected]>

Revert "feat(openai): auto-enable full thinking budget for Opus"

72b612c

This reverts commit e80645e.

Revert "fix(anthropic): always set custom_reasoning_budget when think…

aef492b

…ing is enabled" This reverts commit 2ee549d.

mirrobot-agent bot mentioned this pull request Dec 19, 2025

Anthropic endpoint for claude code #45

Merged

mirrobot-agent bot commented Dec 19, 2025

View reviewed changes

Mirrowel added enhancement New feature or request Agent Monitored Monitored for AI Agent to review PR's and commits Priority labels Dec 19, 2025

Mirrowel closed this Jan 15, 2026

Mirrowel deleted the refactor/anthropic-library-integration branch January 16, 2026 02:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

refactor: Move Anthropic translation layer to library #47

refactor: Move Anthropic translation layer to library #47

Uh oh!

mirrobot-agent bot commented Dec 19, 2025 •

edited by ellipsis-dev bot

Loading

Uh oh!

mirrobot-agent bot commented Dec 19, 2025

Uh oh!

mirrobot-agent bot left a comment

Uh oh!

mirrobot-agent bot Dec 19, 2025

Uh oh!

mirrobot-agent bot Dec 19, 2025

Uh oh!

mirrobot-agent bot Dec 19, 2025

Uh oh!

b3nw commented Jan 9, 2026 •

edited

Loading

Uh oh!

b3nw commented Jan 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

refactor: Move Anthropic translation layer to library #47

refactor: Move Anthropic translation layer to library #47

Uh oh!

Conversation

mirrobot-agent bot commented Dec 19, 2025 • edited by ellipsis-dev bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

New Library Module: rotator_library/anthropic_compat/

Updated rotator_library/client.py

Simplified proxy_app/main.py

Benefits

Files Changed

Uh oh!

mirrobot-agent bot commented Dec 19, 2025

Uh oh!

mirrobot-agent bot left a comment

Choose a reason for hiding this comment

Self-Review Assessment

Architectural Reflections

Key Fixes I Should Make

Things That Actually Look Good

Uh oh!

mirrobot-agent bot Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

mirrobot-agent bot Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

mirrobot-agent bot Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

b3nw commented Jan 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

BUG Report: Anthropic Compatibility & Claude Code Integration

Executive Summary

Resolved Issues:

1. Antigravity API 400 Error (Tool Schemas)

Issue Description

Root Cause

Error Message

Resolution

2. TypeError in Anthropic Streaming Wrapper

Issue Description

Root Cause

Error Message

Resolution

3. Pydantic 422 "Unprocessable Content"

Issue Description

Root Cause

Resolution

Uh oh!

b3nw commented Jan 9, 2026

Issue: Hardcoded Background Model Probes in Claude Code

Description

Root Cause

Impact

Recommended Workaround

Status

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

mirrobot-agent bot commented Dec 19, 2025 •

edited by ellipsis-dev bot

Loading

New Library Module: `rotator_library/anthropic_compat/`

Updated `rotator_library/client.py`

Simplified `proxy_app/main.py`

b3nw commented Jan 9, 2026 •

edited

Loading