
Add Azure OpenAI, Azure AI Foundry, Ollama, and OpenAI-compatible providers#26

Draft
nyimbi wants to merge 1 commit into VRSEN:main from nyimbi:feat/azure-ollama-providers

Conversation


@nyimbi nyimbi commented May 9, 2026

Summary

Extends OpenSwarm's provider matrix from three (OpenAI / Anthropic / Google) to seven, plus a runtime provider switch reachable from the orchestrator.

What's new

Providers

  • Azure OpenAI Service — your own gpt-* deployment (azure/<deployment>).
  • Azure AI Foundry — catalog of Anthropic Claude (Opus/Sonnet), Llama, Mistral, DeepSeek, etc. via azure_ai/<model>. For Anthropic models, the base URL ends with /anthropic.
  • Ollama (local) — keyless, defaults to http://localhost:11434. OLLAMA_API_BASE is threaded explicitly into LitellmModel.
  • OpenAI-compatible — generic openai_compat/<model> route covering Ollama Cloud, Groq, Together AI, Mistral La Plateforme, OpenRouter, vLLM. Uses dedicated OPENAI_COMPAT_* env vars so a real OPENAI_API_KEY kept for fallback is never overwritten.
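For concreteness, a hypothetical `.env` fragment wiring up the openai_compat route (values are placeholders; only the `OPENAI_COMPAT_*` variable names follow the scheme described above):

```ini
DEFAULT_MODEL=openai_compat/llama3.1-70b          # hypothetical model id
OPENAI_COMPAT_API_BASE=https://api.groq.com/openai/v1
OPENAI_COMPAT_API_KEY=gsk_placeholder             # optional for keyless local endpoints
OPENAI_API_KEY=sk-placeholder                     # real fallback key, never overwritten
```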

Runtime switching

  • SwitchProvider tool registered exclusively on the orchestrator (lives under orchestrator/tools/ rather than shared_tools/ so specialists can't import it). Users say "switch to ollama llama3.1" or "switch to azure_ai claude-opus-4-1"; the tool validates credentials, writes DEFAULT_MODEL to .env atomically, and signals run_utils.main() to rebuild the agency on next TUI exit.
  • The orchestrator's "router only" contract is preserved with one documented carve-out (provider switching is administrative, not a specialist task).
  • The FastAPI server in server.py doesn't read the restart signal — switching from the API surface is a documented no-op.
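The flag-plus-restart-loop handshake can be sketched roughly as follows; this is a minimal illustration, and the function names, flag file name, and loop shape are assumptions rather than the PR's actual API:

```python
import os
import tempfile
from pathlib import Path

def switch_flag_path() -> Path:
    """Restart flag in a user-scoped tempdir (cf. the 0o700 hardening note)."""
    d = Path(tempfile.gettempdir()) / f"openswarm-{os.getuid()}"
    d.mkdir(mode=0o700, exist_ok=True)
    return d / "switch.flag"

def signal_restart() -> None:
    """SwitchProvider touches the flag before rewriting .env."""
    switch_flag_path().touch()

def run_loop(run_tui) -> None:
    """run_utils.main()-style loop: rebuild the agency when the flag is set."""
    while True:
        run_tui()              # blocks until the TUI exits
        flag = switch_flag_path()
        if flag.exists():
            flag.unlink()      # consume the signal...
            continue           # ...and re-enter with a freshly built agency
        break                  # normal exit: no flag, stop
```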

Architecture

  • Single source of truth: config.PROVIDER_REGISTRY maps slug → (prefix, required_env). Both the wizard and the SwitchProvider tool derive from this table — adding an 8th provider is one entry plus an optional UI entry in onboard.PROVIDERS.
  • get_active_provider() uses longest-prefix-wins lookup so azure_ai/ matches before azure/ would.
  • The anthropic prefix is litellm/claude (not litellm/) so litellm/cohere/... and other LiteLLM third-party models aren't misclassified as Anthropic.
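A sketch of the registry shape and the longest-prefix-wins lookup described above; the slugs and prefixes mirror the PR text, but the table literal and function body are illustrative, not the repo's code:

```python
# slug -> (LiteLLM prefix, required env vars); "" means bare model strings.
PROVIDER_REGISTRY = {
    "openai":        ("",               ("OPENAI_API_KEY",)),
    "anthropic":     ("litellm/claude", ("ANTHROPIC_API_KEY",)),
    "azure":         ("azure/",         ("AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION")),
    "azure_ai":      ("azure_ai/",      ("AZURE_AI_API_KEY", "AZURE_AI_API_BASE")),
    "ollama":        ("ollama/",        ()),
    "openai_compat": ("openai_compat/", ("OPENAI_COMPAT_API_BASE",)),
}

def get_active_provider(model: str) -> str:
    """Classify a DEFAULT_MODEL string; the longest matching prefix wins."""
    best, best_len = None, 0
    for slug, (prefix, _required_env) in PROVIDER_REGISTRY.items():
        if prefix and model.startswith(prefix) and len(prefix) > best_len:
            best, best_len = slug, len(prefix)
    if best is not None:
        return best
    # Bare model strings route to OpenAI; other litellm/ vendors stay unknown.
    return "unknown" if model.startswith("litellm/") else "openai"
```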

Hardening

  • SSRF defense: SwitchProvider refuses any openai_compat switch where OPENAI_COMPAT_API_BASE isn't an https:// URL with a real hostname. This closes the prompt-injection chain in which an attacker pre-positions the base URL and induces a switch, redirecting all subsequent LLM traffic (bearer tokens and conversation history included) to an attacker-controlled endpoint.
  • Input validation: the model field requires an alphanumeric start followed by characters from [\w./:-]. Blocks newline injection into .env, shell metacharacters, and traversal-style ids.
  • Atomic .env write: the restart flag is touched before the rewrite, so a crash in any window leaves recoverable state. The rewrite uses set_key on a temp copy then os.replace.
  • config._resolve() raises RuntimeError when openai_compat is configured without the base URL, instead of constructing a LitellmModel with None credentials that would fail cryptically at first call.
  • The except clause in _resolve catches only ImportError; TypeError propagates so misconfigured kwargs surface immediately rather than degrading to a bare model string.
  • Restart flag files live in a user-scoped tempdir (mode 0o700) so a co-tenant on /tmp can't force a spurious restart.
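The two input guards might look like the following minimal sketch; the exact regex and function names are assumptions, not the PR's code:

```python
import re
from urllib.parse import urlparse

# Alphanumeric start, then word chars plus . / : - only.
MODEL_RE = re.compile(r"[A-Za-z0-9][\w./:-]*")

def model_name_ok(model: str) -> bool:
    """Reject newline injection, shell metacharacters, and '..' traversal ids."""
    return bool(MODEL_RE.fullmatch(model)) and ".." not in model

def compat_base_ok(url: str) -> bool:
    """openai_compat switches require an https:// URL with a real hostname."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and bool(parsed.hostname)
```

Note that, per the review notes below, `https://127.0.0.1/v1` deliberately passes this check: local self-hosted endpoints are a supported use case.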

Tests

36 pytest cases cover provider validation, the SSRF guard, input validation, atomic-write recovery, missing-credential errors, prefix classification (including longest-prefix-wins for azure_ai/ vs azure/), the openai_compat unwrap to openai/<model>, RuntimeError on missing API_BASE, graceful degradation on ImportError, TypeError propagation, dotenv quoting round-trips, refusing the switch when touching the flag raises OSError, and the wizard's PROVIDERS data-shape contract.

Test scaffolding stubs agency_swarm and openai.types.shared in sys.modules so the suite runs from a bare Python install with just pytest + python-dotenv + pydantic.
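The stubbing trick amounts to planting fake modules in `sys.modules` before the code under test imports them; a minimal illustration (the `BaseTool` attribute is an assumption for the sketch):

```python
import sys
import types

def stub_module(name: str, **attrs) -> types.ModuleType:
    """Register a fake module so later imports resolve to the stub."""
    mod = types.ModuleType(name)
    for key, value in attrs.items():
        setattr(mod, key, value)
    sys.modules[name] = mod
    return mod

class FakeBaseTool:
    """Minimal stand-in so tool classes can still subclass something."""

stub_module("agency_swarm")
stub_module("agency_swarm.tools", BaseTool=FakeBaseTool)

# Any later `from agency_swarm.tools import BaseTool` now hits the stub,
# so the suite never needs the real agency-swarm dependency chain.
```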

$ pip install pytest && pytest
========================== 36 passed in 0.33s =============================

Documentation

  • README.md updated with the 7-provider list, runtime switch description, and an "Upgrading from an earlier version" note.
  • AGENTS.md documents the orchestrator/tools/ convention and the PROVIDER_REGISTRY contract.
  • orchestrator/instructions.md documents the administrative carve-out.
  • .env.example documents every new env var with vendor URL examples for openai_compat (Groq, Together, Mistral, OpenRouter; Ollama Cloud points at https://docs.ollama.com since the canonical endpoint can change).

Test plan

  • All 36 unit tests pass on a bare Python install
  • Pre-PR review by independent code-review-expert and verifier agents (findings addressed)
  • Pending end-to-end verification with real keys: Azure OpenAI deployment, Azure AI Foundry Claude, Ollama Cloud, Groq — these require accounts I don't have here. Happy to coordinate with anyone who has them.
  • Provider switch signal mechanism (flag + run_utils restart loop) verified manually with OPENSWARM_SWITCH_FLAG mocked

Backwards compatibility

Nothing breaks. Bare DEFAULT_MODEL=gpt-5.2 strings still route to OpenAI; litellm/<model> strings still route through LiteLLM. onboard.PROVIDERS shape changed from single env_key to keys: [...] (refactored to support multi-credential providers like Azure), but no external consumer of that dict was found in the repo. Existing .env files keep working.

Notes for review

  • Private-IP SSRF guard: deliberately permissive (allows https://127.0.0.1/v1) because OpenSwarm targets local development and self-hosted vLLM / Ollama Cloud / etc. are legitimate use cases. Open to changing this if you'd prefer a stricter default with an opt-in env var for local endpoints.
  • server.py blind spot: documented rather than fixed. Adding a watchdog/reload mechanism for the FastAPI surface is a bigger change worth a separate PR.

🤖 Generated with Claude Code

…viders

Extends OpenSwarm's provider matrix from three (OpenAI / Anthropic / Google)
to seven, plus a runtime provider switch for the orchestrator.

New providers
- Azure OpenAI Service: your own gpt-* deployment on Azure (LiteLLM
  prefix `azure/`, env: AZURE_API_KEY + AZURE_API_BASE + AZURE_API_VERSION).
- Azure AI Foundry: catalog of non-OpenAI models on Azure including
  Anthropic Claude (Opus / Sonnet), Llama, Mistral, DeepSeek (LiteLLM
  prefix `azure_ai/`, env: AZURE_AI_API_KEY + AZURE_AI_API_BASE). For
  Anthropic models the base URL must end with `/anthropic`.
- Ollama (local): no key required, defaults to http://localhost:11434.
  OLLAMA_API_BASE is threaded explicitly into LitellmModel.
- OpenAI-compatible: generic route for any vendor with an OpenAI-shaped
  API — Ollama Cloud, Groq, Together AI, Mistral La Plateforme,
  OpenRouter, vLLM-based deployments. Uses dedicated OPENAI_COMPAT_*
  env vars so a real OPENAI_API_KEY kept for fallback is never
  overwritten. Only the base URL is required; key is optional for
  keyless local endpoints.

Provider routing
- Single source of truth: config.PROVIDER_REGISTRY maps slug to
  (prefix, required_env). Both the SwitchProvider tool and the
  onboarding wizard derive their behavior from this table.
- DEFAULT_MODEL=openai_compat/<model> is a sentinel that
  config._resolve() unwraps to LiteLLM's openai/<model> with the
  dedicated credentials passed via base_url and api_key.
- get_active_provider() classifies via longest-prefix-wins lookup
  (so azure_ai/ matches before azure/) and returns "unknown" for
  unrecognized litellm/<vendor>/<model> strings.
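The sentinel unwrap described above can be sketched as follows; the function name and kwarg shape are assumptions, not the repo's `_resolve()` signature:

```python
import os

def unwrap_openai_compat(default_model: str):
    """Turn openai_compat/<model> into (openai/<model>, credential kwargs)."""
    if not default_model.startswith("openai_compat/"):
        return default_model, {}
    base = os.environ.get("OPENAI_COMPAT_API_BASE")
    if not base:
        # Fail loudly instead of building a model with None credentials.
        raise RuntimeError("openai_compat requires OPENAI_COMPAT_API_BASE")
    model = default_model.removeprefix("openai_compat/")
    return f"openai/{model}", {
        "base_url": base,
        # Key is optional: keyless local endpoints pass None through.
        "api_key": os.environ.get("OPENAI_COMPAT_API_KEY"),
    }
```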

Runtime switching
- New SwitchProvider tool in orchestrator/tools/, registered only on
  the orchestrator. Users say "switch to ollama llama3.1" or
  "/switch-provider azure_ai claude-opus-4-1"; the tool validates
  credentials, writes DEFAULT_MODEL to .env atomically, and signals
  run_utils.main() to rebuild the agency on next TUI exit. The
  orchestrator's "router only" contract is preserved with a single
  documented carve-out for this administrative concern.
- The FastAPI server (server.py) doesn't read the restart signal —
  switching from an API client is a documented no-op.
- Restart flag files live in a user-scoped tempdir (mode 0o700) so
  a co-tenant on /tmp can't force a spurious restart.

Hardening
- SSRF defense: SwitchProvider refuses any openai_compat switch where
  OPENAI_COMPAT_API_BASE isn't an https:// URL with a real hostname.
  Closes the prompt-injection chain where an attacker pre-positions
  the base URL and induces a switch, redirecting all subsequent LLM
  traffic (with bearer tokens and conversation history).
- Input validation: model field requires an alphanumeric start plus the
  characters real model names use ([\w./:-]). Blocks newline
  injection into .env, shell metacharacters, and `..`-style ids.
- Atomic .env write: the restart flag is touched BEFORE the .env
  rewrite so a crash in any window leaves recoverable state. The
  rewrite uses set_key on a temp copy then os.replace to avoid
  partial-read exposure.
- config._resolve() raises RuntimeError when openai_compat is
  configured without the base URL, instead of returning a
  LitellmModel with None credentials that would fail cryptically at
  first call.
- The except clause in _resolve catches only ImportError;
  TypeError now propagates so misconfigured kwargs surface
  immediately rather than degrading to a bare model string.

Tests
- 36 pytest cases cover provider validation, SSRF guard,
  input validation, atomic write recovery, missing-credential
  errors, prefix classification (incl. longest-prefix-wins for
  azure_ai/ vs azure/), openai_compat unwrap to openai/<model>,
  RuntimeError on missing API_BASE, ImportError graceful degradation,
  TypeError propagation, dotenv quoting round-trips, OSError on flag
  touch refuses switch, and the wizard's PROVIDERS data shape contract.
- Test scaffolding stubs agency_swarm + openai.types.shared in
  sys.modules so the suite runs from a bare Python install with just
  pytest + python-dotenv + pydantic — no need for the full
  agency-swarm dependency chain.

Documentation
- README updated: 7-provider list, runtime switch description,
  upgrading-from-earlier-version section.
- AGENTS.md documents the orchestrator/tools/ convention and the
  PROVIDER_REGISTRY contract.
- orchestrator/instructions.md documents the administrative carve-out.
- .env.example documents every new env var with vendor URL examples
  for openai_compat (Groq, Together, Mistral, OpenRouter; Ollama
  Cloud points at https://docs.ollama.com since the canonical
  endpoint can change).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

nyimbi commented May 9, 2026

Live verification — Azure AI Foundry path

Sent a real prompt to Azure-hosted Claude using the credential layout this PR documents. Results from the new tests/test_live_providers.py::test_live_azure_ai_foundry_claude:

  • Routed through azure_ai/claude-sonnet-4-6
  • Endpoint: https://<resource>.services.ai.azure.com/anthropic (the /anthropic suffix this PR specifically calls out)
  • Real response received from Claude Sonnet 4-6

This validates the Azure-Anthropic-on-Foundry path end-to-end. The new live test module is on the stacked branch (feat/runtime-switch-fastapi, PR #27) so a fresh checkout including this PR plus that one can reproduce with:

pytest tests/test_live_providers.py -v

Tests skip cleanly when their credentials aren't in env (so CI without keys won't break).
