
feat(provider): add Rapid-MLX as named provider#24325

Open
raullenchai wants to merge 1 commit into BerriAI:main from raullenchai:feat/rapid-mlx-provider

Conversation

@raullenchai

Summary

Adds Rapid-MLX as a named provider via the JSON provider registry.

Rapid-MLX is an OpenAI-compatible inference server optimized for Apple Silicon (MLX), offering 2-4x faster inference than Ollama with full tool calling, reasoning separation, and prompt caching.

Changes

  • litellm/llms/openai_like/providers.json — add rapid_mlx entry with base_url, api_key_env, and default_api_key
  • litellm/types/utils.py — add RAPID_MLX = "rapid_mlx" to LlmProviders enum
  • provider_endpoints_support.json — add rapid_mlx provider with supported endpoints
  • docs/my-website/docs/providers/rapid_mlx.md — provider documentation page (install, SDK usage, proxy config, env vars)
  • docs/my-website/sidebars.js — add rapid_mlx to provider docs navigation
  • tests/test_litellm/llms/rapid_mlx/test_rapid_mlx_completion.py — unit tests for provider routing, custom API base, and custom API key

Usage

from litellm import completion

response = completion(
    model="rapid_mlx/default",
    messages=[{"role": "user", "content": "Hello!"}],
)
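Routing is keyed off the model-name prefix. A minimal sketch of how a `provider/model` string splits into its two parts (illustrative only; this is not litellm's actual routing code):

```python
def split_provider_model(model: str) -> tuple[str, str]:
    """Split a 'provider/model' string into provider and model parts."""
    provider, sep, name = model.partition("/")
    if not sep:
        raise ValueError(f"model {model!r} has no provider prefix")
    return provider, name

print(split_provider_model("rapid_mlx/default"))  # ('rapid_mlx', 'default')
```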

Test plan

  • Unit tests verify provider routing (rapid_mlx/ prefix -> correct api_base and api_key)
  • Unit tests verify RAPID_MLX_API_BASE env var override
  • Unit tests verify RAPID_MLX_API_KEY env var override
  • CI linting and unit tests pass
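The env-var override behavior described above can be sketched with a simplified stand-in resolver (the real unit tests mock litellm's completion path and assert on the resolved values; the defaults below mirror the registry entry shown later in this PR):

```python
import os

def resolve_rapid_mlx_config() -> tuple[str, str]:
    """Resolve api_base/api_key the way the registry entry is expected to:
    env vars take precedence, otherwise fall back to registered defaults."""
    base = os.environ.get("RAPID_MLX_API_BASE", "http://localhost:8000/v1")
    key = os.environ.get("RAPID_MLX_API_KEY", "not-needed")
    return base, key

# Defaults apply when no env vars are set.
os.environ.pop("RAPID_MLX_API_BASE", None)
os.environ.pop("RAPID_MLX_API_KEY", None)
assert resolve_rapid_mlx_config() == ("http://localhost:8000/v1", "not-needed")

# Env vars override the registered defaults.
os.environ["RAPID_MLX_API_BASE"] = "http://10.0.0.5:8000/v1"
os.environ["RAPID_MLX_API_KEY"] = "secret"
assert resolve_rapid_mlx_config() == ("http://10.0.0.5:8000/v1", "secret")
```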

Adds Rapid-MLX (https://github.com/raullenchai/Rapid-MLX) as a named
provider via the JSON provider registry. Rapid-MLX is an OpenAI-compatible
inference server optimized for Apple Silicon (MLX), offering 2-4x faster
inference than Ollama with full tool calling and prompt caching.

Changes:
- providers.json: add rapid_mlx entry
- types/utils.py: add RAPID_MLX to LlmProviders enum
- provider_endpoints_support.json: add rapid_mlx provider docs
- docs: add provider documentation page
- sidebars.js: add rapid_mlx to docs navigation
- tests: add unit tests for provider routing
@vercel

vercel bot commented Mar 21, 2026

The latest updates on your projects.

Project: litellm · Deployment: Ready · Updated (UTC): Mar 21, 2026 9:33pm


@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


@codspeed-hq
Contributor

codspeed-hq bot commented Mar 21, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing raullenchai:feat/rapid-mlx-provider (2b27d30) with main (2ea9e20)

Open in CodSpeed

@greptile-apps
Contributor

greptile-apps bot commented Mar 21, 2026

Greptile Summary

This PR adds Rapid-MLX as a named OpenAI-compatible provider via the JSON provider registry, following the same pattern as other local inference providers (e.g., charity_engine, publicai). The overall approach is correct, but two issues need to be addressed before merging.

Key changes:

  • litellm/llms/openai_like/providers.json — registers rapid_mlx with base URL, API key env var, and default key
  • litellm/types/utils.py — adds RAPID_MLX to the LlmProviders enum
  • provider_endpoints_support.json — declares supported endpoints for the UI/docs
  • docs/ — provider documentation and sidebar entry
  • tests/ — mock-only unit tests for routing, API base override, and API key override

Issues found:

  • Missing api_base_env in providers.json: The rapid_mlx entry omits "api_base_env": "RAPID_MLX_API_BASE". The dynamic_config.py resolution logic gates env-var base-URL lookup on this field being present, so RAPID_MLX_API_BASE overrides are silently ignored. The test_rapid_mlx_custom_api_base test will fail as a result.
  • Unsubstantiated a2a/interactions capability flags: provider_endpoints_support.json marks a2a and interactions as true for rapid_mlx, but there is no documentation or evidence that Rapid-MLX implements these protocols. Other comparable local inference providers in the same file omit these fields.

Confidence Score: 2/5

  • Not safe to merge — a missing api_base_env field causes env-var URL overrides to silently break, and unverified A2A/interactions capabilities are advertised.
  • Two functional issues block a clean merge: (1) the api_base_env omission in providers.json breaks the documented and tested RAPID_MLX_API_BASE override, meaning test_rapid_mlx_custom_api_base will fail; (2) a2a and interactions are advertised as supported without any evidence from the Rapid-MLX project. Both are straightforward one-line or two-line fixes.
  • litellm/llms/openai_like/providers.json (missing api_base_env) and provider_endpoints_support.json (unverified a2a/interactions flags) need attention before merging.

Important Files Changed

Filename Overview
litellm/llms/openai_like/providers.json Adds rapid_mlx provider entry; missing the required api_base_env field, causing RAPID_MLX_API_BASE env var overrides to be silently ignored.
provider_endpoints_support.json Adds rapid_mlx endpoint capabilities; incorrectly marks a2a and interactions as true without evidence of actual support in Rapid-MLX.
litellm/types/utils.py Adds RAPID_MLX = "rapid_mlx" to the LlmProviders enum — straightforward and correct.
tests/test_litellm/llms/rapid_mlx/test_rapid_mlx_completion.py Unit tests mock openai_chat_completions.completion (no real network calls — compliant with repo rules); however, test_rapid_mlx_custom_api_base will fail at runtime because the missing api_base_env in providers.json prevents env-var resolution.
docs/my-website/docs/providers/rapid_mlx.md Provider documentation page — well-structured, documents env vars, usage examples, and supported models. No issues.
docs/my-website/sidebars.js Adds providers/rapid_mlx to the sidebar in alphabetical order — correct placement between ragflow and recraft.
tests/test_litellm/llms/rapid_mlx/__init__.py Empty __init__.py to mark the test directory as a Python package — no issues.

Sequence Diagram

sequenceDiagram
    participant U as User
    participant LC as litellm.completion
    participant JL as json_loader
    participant DC as dynamic_config
    participant OAI as openai_chat_completions

    U->>LC: completion(model="rapid_mlx/default")
    LC->>JL: load providers.json rapid_mlx entry
    JL-->>DC: SimpleProviderConfig(base_url, api_key_env, api_base_env=None)
    LC->>DC: _get_openai_compatible_provider_info
    Note over DC: api_base_env is None so env var override is skipped
    Note over DC: Falls back to hardcoded localhost default URL
    DC-->>LC: resolved_base and resolved_key
    LC->>OAI: completion with resolved api_base and api_key
    OAI-->>U: Response

Last reviewed commit: "feat(provider): add ..."

Comment on lines +105 to 109
"rapid_mlx": {
  "base_url": "http://localhost:8000/v1",
  "api_key_env": "RAPID_MLX_API_KEY",
  "default_api_key": "not-needed"
}
Contributor


P1 Missing api_base_env: the RAPID_MLX_API_BASE env var will never be read

The rapid_mlx entry is missing the "api_base_env" field. Without it, provider.api_base_env is None in dynamic_config.py, so the following branch is never reached:

# dynamic_config.py L68-69
if not resolved_base and provider.api_base_env:
    resolved_base = get_secret_str(provider.api_base_env)

This means the RAPID_MLX_API_BASE environment variable documented in the docs and verified by test_rapid_mlx_custom_api_base will be silently ignored — the provider always falls through to the hardcoded "http://localhost:8000/v1" default. Compare with the publicai entry directly above this block, which includes "api_base_env": "PUBLICAI_API_BASE".
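The gating behavior can be reproduced in isolation. A simplified model of the resolution logic (field names follow the snippet above; `os.environ.get` stands in for `get_secret_str`, and this is not the full dynamic_config.py implementation):

```python
import os
from dataclasses import dataclass
from typing import Optional

@dataclass
class SimpleProviderConfig:
    base_url: str
    api_base_env: Optional[str] = None  # None when the field is absent from providers.json

def resolve_base(provider: SimpleProviderConfig) -> str:
    resolved_base = None
    # Mirrors the gate in dynamic_config.py: the env var is only
    # consulted when api_base_env is present in the registry entry.
    if not resolved_base and provider.api_base_env:
        resolved_base = os.environ.get(provider.api_base_env)
    return resolved_base or provider.base_url

os.environ["RAPID_MLX_API_BASE"] = "http://10.0.0.5:8000/v1"

# Without api_base_env, the override is silently ignored:
broken = SimpleProviderConfig(base_url="http://localhost:8000/v1")
assert resolve_base(broken) == "http://localhost:8000/v1"

# With api_base_env, the env var is honored:
fixed = SimpleProviderConfig(base_url="http://localhost:8000/v1",
                             api_base_env="RAPID_MLX_API_BASE")
assert resolve_base(fixed) == "http://10.0.0.5:8000/v1"
```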

Suggested change

Before:
"rapid_mlx": {
  "base_url": "http://localhost:8000/v1",
  "api_key_env": "RAPID_MLX_API_KEY",
  "default_api_key": "not-needed"
}

After:
"rapid_mlx": {
  "base_url": "http://localhost:8000/v1",
  "api_key_env": "RAPID_MLX_API_KEY",
  "api_base_env": "RAPID_MLX_API_BASE",
  "default_api_key": "not-needed"
}

Comment on lines +2586 to +2589
"rerank": false,
"a2a": true,
"interactions": true
}
Contributor


P1 Unsupported a2a and interactions capabilities enabled

"a2a": true and "interactions": true are set for rapid_mlx, but neither the Rapid-MLX documentation, the PR description, nor the rapid_mlx.md provider page mentions support for the A2A (Agent-to-Agent) protocol or the Interactions endpoint.

Contrast with the immediately preceding charity_engine entry (lines 2557–2571), which is also an OpenAI-compatible provider and omits both fields entirely. Unless Rapid-MLX explicitly implements these protocols, advertising them here will route requests to a server that doesn't support them, leading to runtime errors for users who rely on them.

These fields should be removed (or set to false) until A2A/Interactions support is confirmed and documented:

Suggested change

Before:
  "rerank": false,
  "a2a": true,
  "interactions": true
}

After:
  "rerank": false
}
