
Fix Anthropic Claude integration via LiteLLM model routing#25

Open
Neelkanthsahu02 wants to merge 11 commits into VRSEN:main from Neelkanthsahu02:claude/fix-claude-api-integration-1SL0O

Conversation

@Neelkanthsahu02

The previous onboarding default DEFAULT_MODEL=litellm/claude-sonnet-4-6 was stripped to a bare 'claude-sonnet-4-6' before being passed to LiteLLM, leaving no provider prefix to route on, so every request 404'd.

  • config._resolve now strips the optional 'litellm/' prefix and auto-prepends 'anthropic/' (or 'gemini/') when the resulting name is a bare model id, matching the convention already used in slides_agent/tools/InsertNewSlides.py.
  • is_openai_provider now correctly classifies bare claude-* / gemini-* names as non-OpenAI, so reasoning settings aren't applied to them.
  • onboard.py writes the explicit 'litellm/anthropic/claude-sonnet-4-6' default for new Anthropic setups; legacy .env files keep working.
  • .env.example documents that ANTHROPIC_API_KEY accepts both standard API keys (sk-ant-api...) and Claude Pro/Max/Code subscription OAuth tokens (sk-ant-oat01-...) generated via 'claude setup-token'.

claude added 11 commits May 9, 2026 09:13
The previous onboarding default DEFAULT_MODEL=litellm/claude-sonnet-4-6
was stripped to a bare 'claude-sonnet-4-6' before being passed to
LiteLLM, leaving no provider prefix to route on, so every request 404'd.

- config._resolve now strips the optional 'litellm/' prefix and
  auto-prepends 'anthropic/' (or 'gemini/') when the resulting name
  is a bare model id, matching the convention already used in
  slides_agent/tools/InsertNewSlides.py.
- is_openai_provider now correctly classifies bare claude-* / gemini-*
  names as non-OpenAI, so reasoning settings aren't applied to them.
- onboard.py writes the explicit 'litellm/anthropic/claude-sonnet-4-6'
  default for new Anthropic setups; legacy .env files keep working.
- .env.example documents that ANTHROPIC_API_KEY accepts both standard
  API keys (sk-ant-api...) and Claude Pro/Max/Code subscription OAuth
  tokens (sk-ant-oat01-...) generated via 'claude setup-token'.
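
A minimal sketch of that normalization, assuming config._resolve returns
the provider-qualified string handed to LiteLLM (the function body is
illustrative, not the PR's exact code):

    def _resolve(model: str) -> str:
        name = model.removeprefix("litellm/")  # tolerate the onboarding prefix
        if name.startswith("claude-"):         # bare Anthropic id: add provider
            name = f"anthropic/{name}"
        elif name.startswith("gemini-"):       # bare Google id: add provider
            name = f"gemini/{name}"
        return name                            # e.g. 'anthropic/claude-sonnet-4-6'
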
Adds first-class support for routing every model call through a local
OpenAI-compatible proxy (claude-code-router, 9router, ccproxy, etc.)
that authenticates against a Claude Pro/Max/Claude Code subscription.
Set OPENAI_BASE_URL to the router URL and pick whatever model name
the router maps to Claude.

- config._resolve: when OPENAI_BASE_URL is set, return the model string
  as-is so the OpenAI client hits the router; do not auto-route bare
  claude-* names through LiteLLM (LiteLLM ignores OPENAI_BASE_URL).
- config.is_openai_provider: return False whenever OPENAI_BASE_URL is
  set, so OpenAI-only ModelSettings (reasoning summaries) are not sent
  to the router/Claude.
- swarm.py: disable tracing in router mode — the dummy OPENAI_API_KEY
  the router accepts would be rejected by OpenAI's real tracing
  endpoint.
- .env.example: document the router workflow as the supported way to
  use a Claude subscription, and remove the previous claim that
  ANTHROPIC_API_KEY accepts sk-ant-oat01-... OAuth tokens directly
  (LiteLLM does not implement the OAuth header protocol, so that path
  silently fails).
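
Sketched together, assuming the env-var checks described above (function
bodies are illustrative):

    import os

    def _resolve(model: str) -> str:
        if os.getenv("OPENAI_BASE_URL"):
            return model  # router mode: forward the name verbatim
        ...               # otherwise: LiteLLM auto-routing as before

    def is_openai_provider(model: str) -> bool:
        if os.getenv("OPENAI_BASE_URL"):
            return False  # never send OpenAI-only ModelSettings to a router
        return not model.startswith(("litellm/", "claude-", "gemini-"))
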
InsertNewSlides and ModifySlide hardcoded gpt-5.3-codex (OpenAI) and
'anthropic/claude-sonnet-4-6' via direct LiteLLM, bypassing the local
router. Bare 'gpt-5.3-codex' isn't exposed by routers like 9router
(it's served as 'cx/gpt-5.3-codex'), and direct LiteLLM/Anthropic
calls don't go through subscription auth.

When OPENAI_BASE_URL is set:
- Use the agency's DEFAULT_MODEL via OpenAIResponsesModel against the
  router-bound AsyncOpenAI() client. Skip the ANTHROPIC_API_KEY
  branch and the 'gpt-5.3-codex' fallback entirely.
- Disable reasoning summary / verbosity ModelSettings — these are
  OpenAI-Responses-only options that the router/Claude reject.

Non-router behavior is unchanged.
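
Roughly, the router branch looks like this (helper location and names
assumed; AsyncOpenAI() reads OPENAI_BASE_URL and OPENAI_API_KEY from the
environment):

    import os
    from openai import AsyncOpenAI
    from agents import OpenAIResponsesModel

    if os.getenv("OPENAI_BASE_URL"):
        model = OpenAIResponsesModel(
            model=os.environ["DEFAULT_MODEL"],  # router decides what this maps to
            openai_client=AsyncOpenAI(),
        )
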
Reported: agency-swarm rejected DEFAULT_MODEL=cc/claude-opus-4-7 with
'Unknown prefix: cc' because anything-with-a-slash gets parsed as
'<provider>/<model>' and 'cc' is not a registered provider.

Routers like 9router namespace models with their own prefixes
(cc/..., cx/...). To prevent agency-swarm from interpreting that
prefix at all, hand it a pre-built OpenAIResponsesModel instance
bound to the router-aware AsyncOpenAI() client. The SDK then uses
the instance directly and forwards 'cc/claude-opus-4-7' verbatim
to the router.
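
A sketch of the instance-passing fix (agent construction simplified):

    from openai import AsyncOpenAI
    from agents import Agent, OpenAIResponsesModel

    client = AsyncOpenAI()  # base_url/api_key come from OPENAI_* env vars
    agent = Agent(
        name="example",
        # a Model instance is used as-is; 'cc/' is never parsed as a provider
        model=OpenAIResponsesModel(model="cc/claude-opus-4-7",
                                   openai_client=client),
    )
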
Reported: 'validation error for ModelResponse output: Input should be a
valid list, input_value=None'. 9router's /v1/responses endpoint does
not return Responses-API-shaped output items, so the SDK gets None
where it expects a list and the run fails after the user-visible
message has already streamed in.

Chat Completions has been a stable, universally implemented OpenAI
API for years; routers and proxies that wrap subscription auth
implement it correctly. Switching to OpenAIChatCompletionsModel for
router mode in config.py — and reusing the resolver in the slides
agent helpers — fixes the validation error without changing
behavior for direct OpenAI/LiteLLM paths.
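
The router-mode resolver after this commit, sketched (helper name
assumed):

    from openai import AsyncOpenAI
    from agents import OpenAIChatCompletionsModel

    def make_router_model(model_id: str) -> OpenAIChatCompletionsModel:
        # Chat Completions instead of Responses: routers implement it reliably
        return OpenAIChatCompletionsModel(model=model_id,
                                          openai_client=AsyncOpenAI())
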
Reported: 'Hosted tools are not supported with the ChatCompletions API.
Got tool type: WebSearchTool'.

Hosted tools (WebSearchTool, FileSearchTool, HostedMCPTool,
ImageGenerationTool, CodeInterpreterTool, ComputerTool) are
server-side OpenAI Responses-API features that only run against
api.openai.com. With router mode using Chat Completions the SDK
rejects them at agent construction.

Add config.filter_hosted_tools() that no-ops when not in router mode
and strips known hosted tool instances when it is. Apply it to the
five agents that include WebSearchTool: deep_research, docs_agent,
data_analyst_agent, virtual_assistant, slides_agent.

Web search is therefore unavailable in router mode for now. Users
who need search can either run with direct OpenAI auth, or wire a
custom search tool (the repo already includes ScholarSearch /
ProductSearch backed by SEARCH_API_KEY).
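
A hedged sketch of the filter (the hosted-tool classes are real
openai-agents exports; _router_mode is an assumed helper):

    import os
    from agents import ComputerTool, FileSearchTool, WebSearchTool

    _HOSTED = (WebSearchTool, FileSearchTool, ComputerTool)  # subset for brevity

    def _router_mode() -> bool:          # assumed helper
        return bool(os.getenv("OPENAI_BASE_URL"))

    def filter_hosted_tools(tools: list) -> list:
        if not _router_mode():
            return tools                 # no-op outside router mode
        return [t for t in tools if not isinstance(t, _HOSTED)]
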
New ClaudeOAuthModel — a custom OpenAI Agents SDK Model implementation
that calls api.anthropic.com via the official anthropic Python SDK
with a Claude Code subscription OAuth token (sk-ant-oat01-...). This
gives subscription billing without depending on a local router proxy.

Auth:
- auth_token=<token> on AsyncAnthropic → Authorization: Bearer header.
- default_headers={'anthropic-beta': 'oauth-2025-04-20'}.
- System prompt prefixed with 'You are Claude Code, Anthropic's
  official CLI for Claude.' as required by the subscription path.
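
In anthropic-SDK terms, the auth setup above is roughly (env-var name per
this PR):

    import os
    from anthropic import AsyncAnthropic

    client = AsyncAnthropic(
        auth_token=os.environ["CLAUDE_CODE_OAUTH_TOKEN"],  # Authorization: Bearer
        default_headers={"anthropic-beta": "oauth-2025-04-20"},
    )
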

Wiring:
- config.is_oauth_mode(): true when CLAUDE_CODE_OAUTH_TOKEN is set
  or ANTHROPIC_API_KEY starts with sk-ant-oat.
- config._resolve takes the OAuth branch ahead of router/LiteLLM,
  strips any 'litellm/'/'cc/'/'cx/' prefix from DEFAULT_MODEL, and
  hands a ClaudeOAuthModel instance to each agent.
- Hosted tools (WebSearchTool etc.) are also stripped in OAuth mode
  via filter_hosted_tools, since Anthropic does not run OpenAI's
  server-side tools.
- requirements.txt picks up anthropic>=0.40.0.
- .env.example documents the OAuth path with model ids and token
  generation instructions (claude setup-token).

Translation layer (in claude_oauth_model.py):
- OpenAI Responses-API input items → Anthropic messages with
  text/tool_use/tool_result blocks.
- OpenAI Tool/Handoff schemas → Anthropic tools (name, description,
  input_schema).
- Anthropic content blocks → OpenAI Responses-style output items
  (message + function_call) so the rest of agency-swarm consumes
  the response unchanged.
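
The tool-schema direction of that mapping is small enough to sketch
(attribute names from the openai-agents FunctionTool type; dict shape per
Anthropic's tools parameter):

    def to_anthropic_tool(t) -> dict:
        return {
            "name": t.name,
            "description": t.description,
            "input_schema": t.params_json_schema,
        }
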

Streaming is best-effort: stream_response delegates to get_response.
Real per-token streaming requires emitting the right
TResponseStreamEvent variants and is left for a future iteration —
the TUI still renders the final assistant message correctly.

Untested in CI; needs iteration on first run against a real OAuth
token. Existing router and OpenAI paths are unchanged.

… crash

Reported: "'async for' requires an object with __aiter__ method, got
coroutine" when the TUI consumed a ClaudeOAuthModel response.

Cause: stream_response was an 'async def' that returned a generator
object, so calling it produced a coroutine, not an async iterator. The
SDK's `async for event in stream_response(...)` then failed.

Fix:
- stream_response is now a regular def that returns the async
  generator method `_stream_iter` un-awaited. The SDK iterates it
  directly.
- _stream_iter drives anthropic.messages.stream() and emits
  ResponseTextDeltaEvent for each Anthropic text_delta plus a final
  ResponseCompletedEvent carrying the assembled ModelResponse. We use
  Pydantic's model_construct() so we don't have to populate every
  field of Response/ResponseCompletedEvent.

This restores live text rendering in the TUI for OAuth mode.
Tool-call streaming is intentionally minimal — the final completed
event carries the full output items list, which the SDK uses for
function-call dispatch.
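
The pattern, reduced to a runnable minimum:

    import asyncio

    class Model:
        async def _stream_iter(self):
            for chunk in ("hel", "lo"):
                yield chunk            # stands in for ResponseTextDeltaEvent

        def stream_response(self):     # plain def: hands back the async generator
            return self._stream_iter()

    async def main():
        async for event in Model().stream_response():
            print(event)

    asyncio.run(main())
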
Investigated openswarm-ai/openswarm to see how they let users run
on a Claude subscription without dealing with OAuth tokens. Their
backend (apps/settings/credentials.py:get_anthropic_client) just
constructs the Anthropic Python SDK with:

    anthropic.AsyncAnthropic(
        api_key="9router",
        base_url="http://localhost:20128",
    )

The router does the OAuth dance internally; the SDK sees a normal
Anthropic Messages endpoint. No Bearer header, no oauth-2025-04-20
beta, no claude setup-token needed.

Refactor:
- ClaudeOAuthModel → ClaudeRouterModel (alias kept for compat).
  Constructor takes ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY (default
  literal "9router") and points the anthropic SDK at the router.
  Drops auth_token, drops the anthropic-beta header, drops the
  Claude-Code system-prompt preamble.
- config.is_oauth_mode now triggers on ANTHROPIC_BASE_URL (the
  canonical anthropic SDK env var) — keeps CLAUDE_CODE_OAUTH_TOKEN
  and sk-ant-oat... detection as legacy aliases.
- config._resolve passes the model id through unchanged so the router
  can route by its own prefix (cc/, cx/, gc/).
- .env.example documents the simpler setup:
      ANTHROPIC_BASE_URL=http://localhost:20128
      ANTHROPIC_API_KEY=9router
      DEFAULT_MODEL=cc/claude-opus-4-7

This should resolve the auth failures the previous OAuth path hit.
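
Constructor sketch (defaults as described above; class structure
illustrative):

    import os
    from anthropic import AsyncAnthropic

    class ClaudeRouterModel:
        def __init__(self, model: str):
            self.model = model  # forwarded verbatim, e.g. 'cc/claude-opus-4-7'
            self.client = AsyncAnthropic(
                base_url=os.environ["ANTHROPIC_BASE_URL"],  # http://localhost:20128
                api_key=os.getenv("ANTHROPIC_API_KEY", "9router"),
            )
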
Replaces the Anthropic-SDK-with-OAuth-tokens approach (which kept
hitting auth issues) with the same pattern openswarm-ai uses:
spawn the local `claude` CLI via claude_agent_sdk and inherit its
stored subscription credentials.

claude_oauth_model.py rewritten as ClaudeAgentSDKModel:
- get_response/stream_response build an in-process SDK MCP server
  exposing the agent's tools as MCP tools, then call
  claude_agent_sdk.query() with that server registered.
- Tool calls execute INSIDE one query() invocation — Claude calls
  our wrapped MCP tool, the SDK runs our wrapper which dispatches
  back to the original agency-swarm Tool.on_invoke_tool, the result
  is returned to Claude, the loop continues until Claude emits a
  final assistant message.
- Built-in Claude Code tools (Bash, WebSearch, Edit, ...) are
  disabled by default for behavior parity. CLAUDE_USE_BUILTIN_TOOLS=1
  re-enables them.
- Streaming yields ResponseTextDeltaEvent per text chunk and a final
  ResponseCompletedEvent — same wire shape used elsewhere in
  agency-swarm.
- Past tool calls in the input history are flattened to text;
  Claude can't replay tool_use blocks it didn't generate this run.

config.is_oauth_mode now also auto-detects: if claude-agent-sdk is
installed AND DEFAULT_MODEL looks like a Claude id AND no other
backend env is set, route through the SDK. Existing
ANTHROPIC_BASE_URL / sk-ant-oat / CLAUDE_CODE_OAUTH_TOKEN paths
still trigger it. CLAUDE_USE_AGENT_SDK=1 is an explicit override.

requirements.txt picks up claude-agent-sdk>=0.1.0.
.env.example documents the SDK-CLI workflow as the recommended
subscription path; remote ANTHROPIC_BASE_URL stays as an alternative.

Backwards-compat: ClaudeOAuthModel and ClaudeRouterModel are kept as
aliases for ClaudeAgentSDKModel.
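
The query()-with-in-process-MCP-server pattern, sketched against the
claude-agent-sdk API as documented (the real ClaudeAgentSDKModel wraps
agency-swarm tools dynamically; this example tool is hypothetical):

    from claude_agent_sdk import (
        ClaudeAgentOptions, create_sdk_mcp_server, query, tool,
    )

    @tool("get_time", "Return the current time", {})
    async def get_time(args):
        return {"content": [{"type": "text", "text": "12:00"}]}

    server = create_sdk_mcp_server(name="agency", version="1.0.0",
                                   tools=[get_time])
    options = ClaudeAgentOptions(
        mcp_servers={"agency": server},
        allowed_tools=["mcp__agency__get_time"],
    )

    async def run(prompt: str):
        async for message in query(prompt=prompt, options=options):
            print(message)
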
A focused agency-swarm setup tailored for YouTube content workflows,
backed by ClaudeAgentSDKModel for Claude subscription auth. v1 scope
is intentionally narrow: idea generation and scripting only — no
thumbnails, video production, or scheduling yet.

Agents (each ~30 lines):
- Orchestrator: routes the request, holds the user-facing voice,
  assembles deliverables. Never produces them itself.
- Researcher: gets Claude Code's built-in WebSearch + WebFetch
  (use_builtin_tools=True passed to its model adapter). Returns a
  structured research brief — top performers, trending angles,
  audience pain points, sources.
- Ideator: turns the brief into 8 ranked ideas with hooks, target
  viewers, and brief-grounded "why this works" citations. Toolless.
- Scripter: takes one idea and writes a complete ready-to-record
  script (hook, intro, body with B-roll cues, CTA, outro). One
  script per call. Toolless.

Layout:
- youtube_swarm/instructions/*.md  prompts (orchestrator/researcher/
  ideator/scripter), each with tight format/voice rules so v1 output
  is deterministic.
- youtube_swarm/agents.py          agent factories.
- youtube_swarm/__init__.py        re-exports.
- youtube_swarm.py                 entry point: `python youtube_swarm.py`
  builds the Agency with orchestrator-as-entry communication flows
  and launches the agency-swarm TUI.

Per-agent toolset switching:
- ClaudeAgentSDKModel.__init__ now accepts use_builtin_tools so the
  Researcher can enable built-in tools (WebSearch/WebFetch) while
  the others stay toolless. Default behaviour for the rest of the
  codebase is unchanged.
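
Per-agent wiring then looks like (constructor kwargs assumed; model id
illustrative):

    from claude_oauth_model import ClaudeAgentSDKModel  # module name per this PR

    researcher_model = ClaudeAgentSDKModel(model="claude-opus-4-7",
                                           use_builtin_tools=True)
    ideator_model = ClaudeAgentSDKModel(model="claude-opus-4-7")  # toolless default
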

Doesn't touch the existing openswarm orchestrator/swarm.py — both
entry points coexist.
