Fix Anthropic Claude integration via LiteLLM model routing #25
Open
Neelkanthsahu02 wants to merge 11 commits into
Conversation
The previous onboarding default DEFAULT_MODEL=litellm/claude-sonnet-4-6 was stripped to a bare 'claude-sonnet-4-6' before being passed to LiteLLM, leaving LiteLLM with no provider prefix to route on, so the call 404s.

- config._resolve now strips the optional 'litellm/' prefix and auto-prepends 'anthropic/' (or 'gemini/') when the resulting name is a bare model id, matching the convention already used in slides_agent/tools/InsertNewSlides.py (sketched below).
- is_openai_provider correctly classifies bare claude-* / gemini-* names as non-OpenAI so reasoning settings aren't applied to them.
- onboard.py writes the explicit 'litellm/anthropic/claude-sonnet-4-6' default for new Anthropic setups; legacy .env files keep working.
- .env.example documents that ANTHROPIC_API_KEY accepts both standard API keys (sk-ant-api...) and Claude Pro/Max/Code subscription OAuth tokens (sk-ant-oat01-...) generated via 'claude setup-token'.
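A minimal sketch of the prefixing rule described above, assuming a string-returning _resolve; the actual config.py code and its return shape may differ:

```python
def _resolve(model: str) -> str:
    name = model.removeprefix("litellm/")    # accept the onboarding default unchanged
    if "/" not in name:                      # bare model id: infer the provider
        if name.startswith("claude"):
            name = f"anthropic/{name}"
        elif name.startswith("gemini"):
            name = f"gemini/{name}"
        else:
            return name                      # OpenAI-style ids keep their default handling
    return f"litellm/{name}"                 # e.g. litellm/anthropic/claude-sonnet-4-6
```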
Adds first-class support for routing every model call through a local OpenAI-compatible proxy (claude-code-router, 9router, ccproxy, etc.) that authenticates against a Claude Pro/Max/Claude Code subscription. Set OPENAI_BASE_URL to the router URL and pick whatever model name the router maps to Claude.

- config._resolve: when OPENAI_BASE_URL is set, return the model string as-is so the OpenAI client hits the router; do not auto-route bare claude-* names through LiteLLM (LiteLLM ignores OPENAI_BASE_URL); see the sketch below.
- config.is_openai_provider: return False whenever OPENAI_BASE_URL is set, so OpenAI-only ModelSettings (reasoning summaries) are not sent to the router/Claude.
- swarm.py: disable tracing in router mode — the dummy OPENAI_API_KEY the router accepts would be rejected by OpenAI's real tracing endpoint.
- .env.example: document the router workflow as the supported way to use a Claude subscription, and remove the previous claim that ANTHROPIC_API_KEY accepts sk-ant-oat01-... OAuth tokens directly (LiteLLM does not implement the OAuth header protocol, so that path silently fails).
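A minimal sketch of the two router-mode branches, assuming the surrounding config.py structure rather than copying it:

```python
import os

def _resolve(model: str):
    if os.getenv("OPENAI_BASE_URL"):
        return model                 # router mode: pass through, LiteLLM would ignore the base URL
    return _litellm_prefix(model)    # hypothetical helper: the prefixing rule from the earlier commit

def is_openai_provider(model: str) -> bool:
    if os.getenv("OPENAI_BASE_URL"):
        return False                 # never send OpenAI-only reasoning settings to the router/Claude
    return not model.startswith(("claude", "gemini", "litellm/"))
```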
InsertNewSlides and ModifySlide hardcoded gpt-5.3-codex (OpenAI) and 'anthropic/claude-sonnet-4-6' via direct LiteLLM, bypassing the local router. Bare 'gpt-5.3-codex' isn't exposed by routers like 9router (it's served as 'cx/gpt-5.3-codex'), and direct LiteLLM/Anthropic calls don't go through subscription auth. When OPENAI_BASE_URL is set:

- Use the agency's DEFAULT_MODEL via OpenAIResponsesModel against the router-bound AsyncOpenAI() client. Skip the ANTHROPIC_API_KEY branch and the 'gpt-5.3-codex' fallback entirely (see the sketch below).
- Disable reasoning summary / verbosity ModelSettings — these are OpenAI-Responses-only options that the router/Claude reject.

Non-router behavior is unchanged.
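A sketch of the router-mode branch for the slides helpers; the helper name is assumed, and the point is only that reasoning/verbosity options are left unset:

```python
import os
from openai import AsyncOpenAI
from agents import ModelSettings, OpenAIResponsesModel

def slides_model():
    if os.getenv("OPENAI_BASE_URL"):
        model = OpenAIResponsesModel(
            model=os.environ["DEFAULT_MODEL"],   # the agency default, not gpt-5.3-codex
            openai_client=AsyncOpenAI(),         # honours OPENAI_BASE_URL / OPENAI_API_KEY
        )
        return model, ModelSettings()            # no reasoning summary / verbosity options
    ...                                          # non-router behaviour unchanged
```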
Reported: agency-swarm rejected DEFAULT_MODEL=cc/claude-opus-4-7 with 'Unknown prefix: cc' because anything-with-a-slash gets parsed as '<provider>/<model>' and 'cc' is not a registered provider. Routers like 9router namespace models with their own prefixes (cc/..., cx/...). To prevent agency-swarm from interpreting that prefix at all, hand it a pre-built OpenAIResponsesModel instance bound to the router-aware AsyncOpenAI() client. The SDK then uses the instance directly and forwards 'cc/claude-opus-4-7' verbatim to the router.
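A sketch of handing the SDK a pre-built Model instance so "cc/claude-opus-4-7" is never parsed as '<provider>/<model>'; the Agent fields here are illustrative, not the repo's actual wiring:

```python
import os
from openai import AsyncOpenAI
from agents import Agent, OpenAIResponsesModel

agent = Agent(
    name="orchestrator",
    instructions="...",
    model=OpenAIResponsesModel(
        model=os.environ["DEFAULT_MODEL"],   # e.g. cc/claude-opus-4-7, forwarded verbatim
        openai_client=AsyncOpenAI(),         # base_url comes from OPENAI_BASE_URL
    ),
)
```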
Reported: 'validation error for ModelResponse output: Input should be a valid list, input_value=None'. 9router's /v1/responses endpoint does not return Responses-API-shaped output items, so the SDK gets None where it expects a list and the run fails after the user-visible message has already streamed in. Chat Completions has been a stable, universally-implemented OpenAI API for years; routers and proxies that wrap subscription auth implement it correctly. Switching to OpenAIChatCompletionsModel for router mode in config.py — and reusing the resolver in the slides agent helpers — fixes the validation error without changing behaviour for direct OpenAI/LiteLLM paths.
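A sketch of the Chat Completions switch for router mode; OpenAIChatCompletionsModel is the SDK class the commit names, while the resolver function here is an assumption:

```python
import os
from openai import AsyncOpenAI
from agents import OpenAIChatCompletionsModel

def _router_model(name: str) -> OpenAIChatCompletionsModel:
    # Routers implement /v1/chat/completions reliably; 9router's /v1/responses
    # output shape is what tripped the ModelResponse validation above.
    return OpenAIChatCompletionsModel(model=name, openai_client=AsyncOpenAI())

model = _router_model(os.environ["DEFAULT_MODEL"])
```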
Reported: 'Hosted tools are not supported with the ChatCompletions API. Got tool type: WebSearchTool'. Hosted tools (WebSearchTool, FileSearchTool, HostedMCPTool, ImageGenerationTool, CodeInterpreterTool, ComputerTool) are server-side OpenAI Responses-API features that only run against api.openai.com. With router mode using Chat Completions the SDK rejects them at agent construction. Add config.filter_hosted_tools() that no-ops when not in router mode and strips known hosted tool instances when it is. Apply it to the five agents that include WebSearchTool: deep_research, docs_agent, data_analyst_agent, virtual_assistant, slides_agent. Web search is therefore unavailable in router mode for now. Users who need search can either run with direct OpenAI auth, or wire a custom search tool (the repo already includes ScholarSearch / ProductSearch backed by SEARCH_API_KEY).
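A minimal sketch of the filter_hosted_tools idea; the exact signature in config.py is an assumption:

```python
import os
from agents import (
    CodeInterpreterTool, ComputerTool, FileSearchTool,
    HostedMCPTool, ImageGenerationTool, WebSearchTool,
)

_HOSTED_TOOL_TYPES = (
    WebSearchTool, FileSearchTool, HostedMCPTool,
    ImageGenerationTool, CodeInterpreterTool, ComputerTool,
)

def filter_hosted_tools(tools):
    """No-op outside router mode; otherwise drop server-side Responses-API tools."""
    if not os.getenv("OPENAI_BASE_URL"):
        return tools
    return [t for t in tools if not isinstance(t, _HOSTED_TOOL_TYPES)]
```

The affected agents would then pass their tool lists through it, e.g. `tools=filter_hosted_tools([WebSearchTool(), ...])`.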
New ClaudeOAuthModel — a custom OpenAI Agents SDK Model implementation
that calls api.anthropic.com via the official anthropic Python SDK
with a Claude Code subscription OAuth token (sk-ant-oat01-...). This
gives subscription billing without depending on a local router proxy.
Auth:
- auth_token=<token> on AsyncAnthropic → Authorization: Bearer header.
- default_headers={'anthropic-beta': 'oauth-2025-04-20'}.
- System prompt prefixed with 'You are Claude Code, Anthropic's
official CLI for Claude.' as required by the subscription path.
Wiring:
- config.is_oauth_mode(): true when CLAUDE_CODE_OAUTH_TOKEN is set
or ANTHROPIC_API_KEY starts with sk-ant-oat.
- config._resolve takes the OAuth branch ahead of router/LiteLLM,
strips any 'litellm/'/'cc/'/'cx/' prefix from DEFAULT_MODEL, and
hands a ClaudeOAuthModel instance to each agent.
- Hosted tools (WebSearchTool etc.) are also stripped in OAuth mode
via filter_hosted_tools, since Anthropic does not run OpenAI's
server-side tools.
- requirements.txt picks up anthropic>=0.40.0.
- .env.example documents the OAuth path with model ids and token
generation instructions (claude setup-token).
Translation layer (in claude_oauth_model.py):
- OpenAI Responses-API input items → Anthropic messages with
text/tool_use/tool_result blocks.
- OpenAI Tool/Handoff schemas → Anthropic tools (name, description,
input_schema).
- Anthropic content blocks → OpenAI Responses-style output items
(message + function_call) so the rest of agency-swarm consumes
the response unchanged.
Streaming is best-effort: stream_response delegates to get_response.
Real per-token streaming requires emitting the right
TResponseStreamEvent variants and is left for a future iteration —
the TUI still renders the final assistant message correctly.
Untested in CI; needs iteration on first run against a real OAuth
token. Existing router and OpenAI paths are unchanged.
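A minimal sketch of the client construction the Auth section above describes; the beta header value and system-prompt preamble are taken from the commit text rather than verified Anthropic documentation, and CLAUDE_CODE_OAUTH_TOKEN is the env var named under Wiring:

```python
import os
from anthropic import AsyncAnthropic

# OAuth token from 'claude setup-token' (sk-ant-oat01-...), sent as Authorization: Bearer.
client = AsyncAnthropic(
    auth_token=os.environ["CLAUDE_CODE_OAUTH_TOKEN"],
    default_headers={"anthropic-beta": "oauth-2025-04-20"},
)

# The subscription path requires this preamble at the start of the system prompt.
SYSTEM_PREFIX = "You are Claude Code, Anthropic's official CLI for Claude."
```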
… crash

Reported: "'async for' requires an object with __aiter__ method, got coroutine" when the TUI consumed a ClaudeOAuthModel response. Cause: stream_response was 'async def' that returned a generator object, so calling it produced a coroutine, not an iterator. The SDK's `async for event in stream_response(...)` then failed.

Fix:
- stream_response is now a regular def that returns the async generator method `_stream_iter` un-awaited (sketched below). The SDK iterates it directly.
- _stream_iter drives anthropic.messages.stream() and emits ResponseTextDeltaEvent for each Anthropic text_delta plus a final ResponseCompletedEvent carrying the assembled ModelResponse. We use Pydantic's model_construct() so we don't have to populate every field of Response/ResponseCompletedEvent.

This restores live text rendering in the TUI for OAuth mode. Tool-call streaming is intentionally minimal — the final completed event carries the full output items list, which the SDK uses for function-call dispatch.
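A sketch of the fixed shape, with the actual Anthropic streaming plumbing stood in by a hypothetical helper:

```python
from typing import Any, AsyncIterator

class ClaudeOAuthModel:
    def stream_response(self, *args: Any, **kwargs: Any) -> AsyncIterator[Any]:
        # Plain def: returns the async generator object (which has __aiter__)
        # instead of a coroutine that would first need to be awaited.
        return self._stream_iter(*args, **kwargs)

    async def _stream_iter(self, *args: Any, **kwargs: Any):
        # In the real adapter this drives anthropic.messages.stream(), yielding a
        # ResponseTextDeltaEvent per text_delta and a final ResponseCompletedEvent;
        # _anthropic_events is a hypothetical stand-in for that plumbing.
        async for event in self._anthropic_events(*args, **kwargs):
            yield event
```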
Investigated openswarm-ai/openswarm to see how they let users run
on a Claude subscription without dealing with OAuth tokens. Their
backend (apps/settings/credentials.py:get_anthropic_client) just
constructs the Anthropic Python SDK with:
anthropic.AsyncAnthropic(
api_key="9router",
base_url="http://localhost:20128",
)
The router does the OAuth dance internally; the SDK sees a normal
Anthropic Messages endpoint. No Bearer header, no oauth-2025-04-20
beta, no claude setup-token needed.
Refactor:
- ClaudeOAuthModel → ClaudeRouterModel (alias kept for compat).
Constructor takes ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY (default
literal "9router") and points the anthropic SDK at the router.
Drops auth_token, drops the anthropic-beta header, drops the
Claude-Code system-prompt preamble.
- config.is_oauth_mode now triggers on ANTHROPIC_BASE_URL (the
canonical anthropic SDK env var) — keeps CLAUDE_CODE_OAUTH_TOKEN
and sk-ant-oat... detection as legacy aliases.
- config._resolve passes the model id through unchanged so the router
can route by its own prefix (cc/, cx/, gc/).
- .env.example documents the simpler setup:
ANTHROPIC_BASE_URL=http://localhost:20128
ANTHROPIC_API_KEY=9router
DEFAULT_MODEL=cc/claude-opus-4-7
This should resolve the auth failures the previous OAuth path hit.
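A sketch of the refactored constructor, mirroring the openswarm snippet above and the env vars documented in .env.example; the surrounding Model interface is elided:

```python
import os
from anthropic import AsyncAnthropic

class ClaudeRouterModel:
    """Anthropic Messages client pointed at a local router; no OAuth handling here."""

    def __init__(self, model: str):
        self.model = model                       # e.g. "cc/claude-opus-4-7", routed by its prefix
        self.client = AsyncAnthropic(
            api_key=os.getenv("ANTHROPIC_API_KEY", "9router"),
            base_url=os.getenv("ANTHROPIC_BASE_URL", "http://localhost:20128"),
        )
```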
Replaces the Anthropic-SDK-with-OAuth-tokens approach (which kept hitting auth issues) with the same pattern openswarm-ai uses: spawn the local `claude` CLI via claude_agent_sdk and inherit its stored subscription credentials.

claude_oauth_model.py rewritten as ClaudeAgentSDKModel:
- get_response/stream_response build an in-process SDK MCP server exposing the agent's tools as MCP tools, then call claude_agent_sdk.query() with that server registered (a hedged sketch follows below).
- Tool calls execute INSIDE one query() invocation — Claude calls our wrapped MCP tool, the SDK runs our wrapper which dispatches back to the original agency-swarm Tool.on_invoke_tool, the result is returned to Claude, the loop continues until Claude emits a final assistant message.
- Built-in Claude Code tools (Bash, WebSearch, Edit, ...) are disabled by default for behavior parity. CLAUDE_USE_BUILTIN_TOOLS=1 re-enables them.
- Streaming yields ResponseTextDeltaEvent per text chunk and a final ResponseCompletedEvent — same wire shape used elsewhere in agency-swarm.
- Past tool calls in the input history are flattened to text; Claude can't replay tool_use blocks it didn't generate this run.

config.is_oauth_mode now also auto-detects: if claude-agent-sdk is installed AND DEFAULT_MODEL looks like a Claude id AND no other backend env is set, route through the SDK. Existing ANTHROPIC_BASE_URL / sk-ant-oat / CLAUDE_CODE_OAUTH_TOKEN paths still trigger it. CLAUDE_USE_AGENT_SDK=1 is an explicit override.

requirements.txt picks up claude-agent-sdk>=0.1.0. .env.example documents the SDK-CLI workflow as the recommended subscription path; remote ANTHROPIC_BASE_URL stays as an alternative.

Backwards-compat: ClaudeOAuthModel and ClaudeRouterModel are kept as aliases for ClaudeAgentSDKModel.
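A hedged sketch of the claude_agent_sdk wiring, assuming the SDK's query / create_sdk_mcp_server / tool helpers match the installed version; the send_message tool and server name are hypothetical stand-ins for the per-tool wrappers the adapter generates:

```python
from claude_agent_sdk import ClaudeAgentOptions, create_sdk_mcp_server, query, tool

# Hypothetical wrapped tool; the real adapter builds one per agency-swarm tool and
# dispatches back to Tool.on_invoke_tool inside the handler.
@tool("send_message", "Relay a message to another agent", {"recipient": str, "message": str})
async def send_message(args):
    result = f"delivered to {args['recipient']}"
    return {"content": [{"type": "text", "text": result}]}

async def run(prompt: str) -> str:
    server = create_sdk_mcp_server(name="agency_tools", version="1.0.0", tools=[send_message])
    options = ClaudeAgentOptions(
        mcp_servers={"agency_tools": server},
        allowed_tools=["mcp__agency_tools__send_message"],  # built-in Claude Code tools stay off
    )
    text = ""
    async for message in query(prompt=prompt, options=options):
        for block in getattr(message, "content", []) or []:
            if hasattr(block, "text"):
                text += block.text
    return text
```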
A focused agency-swarm setup tailored for YouTube content workflows, backed by ClaudeAgentSDKModel for Claude subscription auth. v1 scope is intentionally narrow: idea generation and scripting only — no thumbnails, video production, or scheduling yet.

Agents (each ~30 lines):
- Orchestrator: routes the request, holds the user-facing voice, assembles deliverables. Never produces them itself.
- Researcher: gets Claude Code's built-in WebSearch + WebFetch (use_builtin_tools=True passed to its model adapter). Returns a structured research brief — top performers, trending angles, audience pain points, sources.
- Ideator: turns the brief into 8 ranked ideas with hooks, target viewers, and brief-grounded "why this works" citations. Toolless.
- Scripter: takes one idea and writes a complete ready-to-record script (hook, intro, body with B-roll cues, CTA, outro). One script per call. Toolless.

Layout:
- youtube_swarm/instructions/*.md prompts (orchestrator/researcher/ideator/scripter), each with tight format/voice rules so v1 output is deterministic.
- youtube_swarm/agents.py agent factories.
- youtube_swarm/__init__.py re-exports.
- youtube_swarm.py entry point: `python youtube_swarm.py` builds the Agency with orchestrator-as-entry communication flows and launches the agency-swarm TUI.

Per-agent toolset switching (see the sketch below):
- ClaudeAgentSDKModel.__init__ now accepts use_builtin_tools so the Researcher can enable built-in tools (WebSearch/WebFetch) while the others stay toolless. Default behaviour for the rest of the codebase is unchanged.

Doesn't touch the existing openswarm orchestrator/swarm.py — both entry points coexist.
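A sketch of two of the agent factories with the per-agent toolset switch; the Agent field names and the ClaudeAgentSDKModel constructor shape are illustrative:

```python
from agency_swarm import Agent
from claude_oauth_model import ClaudeAgentSDKModel  # rewritten adapter from the previous commit

def create_researcher() -> Agent:
    return Agent(
        name="Researcher",
        instructions=open("youtube_swarm/instructions/researcher.md").read(),
        model=ClaudeAgentSDKModel(use_builtin_tools=True),  # WebSearch / WebFetch enabled
    )

def create_ideator() -> Agent:
    return Agent(
        name="Ideator",
        instructions=open("youtube_swarm/instructions/ideator.md").read(),
        model=ClaudeAgentSDKModel(),                        # toolless, default behaviour
    )
```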