
feat(provider): add Rapid-MLX as named provider#24325

Open
raullenchai wants to merge 1 commit into BerriAI:main from raullenchai:feat/rapid-mlx-provider

Conversation

@raullenchai

Summary

Adds Rapid-MLX as a named provider via the JSON provider registry.

Rapid-MLX is an OpenAI-compatible inference server optimized for Apple Silicon (MLX), offering 2-4x faster inference than Ollama with full tool calling, reasoning separation, and prompt caching.

Changes

  • litellm/llms/openai_like/providers.json — add rapid_mlx entry with base_url, api_key_env, and default_api_key
  • litellm/types/utils.py — add RAPID_MLX = "rapid_mlx" to LlmProviders enum
  • provider_endpoints_support.json — add rapid_mlx provider with supported endpoints
  • docs/my-website/docs/providers/rapid_mlx.md — provider documentation page (install, SDK usage, proxy config, env vars)
  • docs/my-website/sidebars.js — add rapid_mlx to provider docs navigation
  • tests/test_litellm/llms/rapid_mlx/test_rapid_mlx_completion.py — unit tests for provider routing, custom API base, and custom API key

Usage

from litellm import completion

response = completion(
    model="rapid_mlx/default",
    messages=[{"role": "user", "content": "Hello!"}],
)
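Routing is keyed off the model-name prefix. A minimal sketch of how a `provider/model` string splits into its two parts (illustrative only; this is not litellm's actual routing code):

```python
def split_provider_model(model: str) -> tuple[str, str]:
    """Split a 'provider/model' string into provider and model parts."""
    provider, sep, name = model.partition("/")
    if not sep:
        raise ValueError(f"model {model!r} has no provider prefix")
    return provider, name

print(split_provider_model("rapid_mlx/default"))  # ('rapid_mlx', 'default')
```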

Test plan

  • Unit tests verify provider routing (rapid_mlx/ prefix -> correct api_base and api_key)
  • Unit tests verify RAPID_MLX_API_BASE env var override
  • Unit tests verify RAPID_MLX_API_KEY env var override
  • CI linting and unit tests pass
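The env-var override behavior described above can be sketched with a simplified stand-in resolver (the real unit tests mock litellm's completion path and assert on the resolved values; the defaults below mirror the registry entry shown later in this PR):

```python
import os

def resolve_rapid_mlx_config() -> tuple[str, str]:
    """Resolve api_base/api_key the way the registry entry is expected to:
    env vars take precedence, otherwise fall back to registered defaults."""
    base = os.environ.get("RAPID_MLX_API_BASE", "http://localhost:8000/v1")
    key = os.environ.get("RAPID_MLX_API_KEY", "not-needed")
    return base, key

# Defaults apply when no env vars are set.
os.environ.pop("RAPID_MLX_API_BASE", None)
os.environ.pop("RAPID_MLX_API_KEY", None)
assert resolve_rapid_mlx_config() == ("http://localhost:8000/v1", "not-needed")

# Env vars override the registered defaults.
os.environ["RAPID_MLX_API_BASE"] = "http://10.0.0.5:8000/v1"
os.environ["RAPID_MLX_API_KEY"] = "secret"
assert resolve_rapid_mlx_config() == ("http://10.0.0.5:8000/v1", "secret")
```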

Adds Rapid-MLX (https://github.com/raullenchai/Rapid-MLX) as a named
provider via the JSON provider registry. Rapid-MLX is an OpenAI-compatible
inference server optimized for Apple Silicon (MLX), offering 2-4x faster
inference than Ollama with full tool calling and prompt caching.

Changes:
- providers.json: add rapid_mlx entry
- types/utils.py: add RAPID_MLX to LlmProviders enum
- provider_endpoints_support.json: add rapid_mlx provider docs
- docs: add provider documentation page
- sidebars.js: add rapid_mlx to docs navigation
- tests: add unit tests for provider routing
@vercel

vercel bot commented Mar 21, 2026

The latest updates on your projects.

Project: litellm · Deployment: Ready · Updated (UTC): Mar 21, 2026 9:33pm


@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


@codspeed-hq
Contributor

codspeed-hq bot commented Mar 21, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing raullenchai:feat/rapid-mlx-provider (2b27d30) with main (2ea9e20)

Open in CodSpeed

@greptile-apps
Contributor

greptile-apps bot commented Mar 21, 2026

Greptile Summary

This PR adds Rapid-MLX as a named OpenAI-compatible provider via the JSON provider registry, following the same pattern as other local inference providers (e.g., charity_engine, publicai). The overall approach is correct, but two issues need to be addressed before merging.

Key changes:

  • litellm/llms/openai_like/providers.json — registers rapid_mlx with base URL, API key env var, and default key
  • litellm/types/utils.py — adds RAPID_MLX to the LlmProviders enum
  • provider_endpoints_support.json — declares supported endpoints for the UI/docs
  • docs/ — provider documentation and sidebar entry
  • tests/ — mock-only unit tests for routing, API base override, and API key override

Issues found:

  • Missing api_base_env in providers.json: The rapid_mlx entry omits "api_base_env": "RAPID_MLX_API_BASE". The dynamic_config.py resolution logic gates env-var base-URL lookup on this field being present, so RAPID_MLX_API_BASE overrides are silently ignored. The test_rapid_mlx_custom_api_base test will fail as a result.
  • Unsubstantiated a2a/interactions capability flags: provider_endpoints_support.json marks a2a and interactions as true for rapid_mlx, but there is no documentation or evidence that Rapid-MLX implements these protocols. Other comparable local inference providers in the same file omit these fields.

Confidence Score: 2/5

  • Not safe to merge — a missing api_base_env field causes env-var URL overrides to silently break, and unverified A2A/interactions capabilities are advertised.
  • Two functional issues block a clean merge: (1) the api_base_env omission in providers.json breaks the documented and tested RAPID_MLX_API_BASE override, meaning test_rapid_mlx_custom_api_base will fail; (2) a2a and interactions are advertised as supported without any evidence from the Rapid-MLX project. Both are straightforward one-line or two-line fixes.
  • litellm/llms/openai_like/providers.json (missing api_base_env) and provider_endpoints_support.json (unverified a2a/interactions flags) need attention before merging.

Important Files Changed

Filename Overview
litellm/llms/openai_like/providers.json Adds rapid_mlx provider entry; missing the required api_base_env field, causing RAPID_MLX_API_BASE env var overrides to be silently ignored.
provider_endpoints_support.json Adds rapid_mlx endpoint capabilities; incorrectly marks a2a and interactions as true without evidence of actual support in Rapid-MLX.
litellm/types/utils.py Adds RAPID_MLX = "rapid_mlx" to the LlmProviders enum — straightforward and correct.
tests/test_litellm/llms/rapid_mlx/test_rapid_mlx_completion.py Unit tests mock openai_chat_completions.completion (no real network calls — compliant with repo rules); however, test_rapid_mlx_custom_api_base will fail at runtime because the missing api_base_env in providers.json prevents env-var resolution.
docs/my-website/docs/providers/rapid_mlx.md Provider documentation page — well-structured, documents env vars, usage examples, and supported models. No issues.
docs/my-website/sidebars.js Adds providers/rapid_mlx to the sidebar in alphabetical order — correct placement between ragflow and recraft.
tests/test_litellm/llms/rapid_mlx/__init__.py Empty __init__.py to mark the test directory as a Python package — no issues.

Sequence Diagram

sequenceDiagram
    participant U as User
    participant LC as litellm.completion
    participant JL as json_loader
    participant DC as dynamic_config
    participant OAI as openai_chat_completions

    U->>LC: completion(model="rapid_mlx/default")
    LC->>JL: load providers.json rapid_mlx entry
    JL-->>DC: SimpleProviderConfig(base_url, api_key_env, api_base_env=None)
    LC->>DC: _get_openai_compatible_provider_info
    Note over DC: api_base_env is None so env var override is skipped
    Note over DC: Falls back to hardcoded localhost default URL
    DC-->>LC: resolved_base and resolved_key
    LC->>OAI: completion with resolved api_base and api_key
    OAI-->>U: Response

Last reviewed commit: "feat(provider): add ..."

Comment on lines +105 to 109
"rapid_mlx": {
  "base_url": "http://localhost:8000/v1",
  "api_key_env": "RAPID_MLX_API_KEY",
  "default_api_key": "not-needed"
}
Contributor


P1 Missing api_base_env: the RAPID_MLX_API_BASE env var will never be read

The rapid_mlx entry is missing the "api_base_env" field. Without it, provider.api_base_env is None in dynamic_config.py, so the following branch is never reached:

# dynamic_config.py L68-69
if not resolved_base and provider.api_base_env:
    resolved_base = get_secret_str(provider.api_base_env)

This means the RAPID_MLX_API_BASE environment variable documented in the docs and verified by test_rapid_mlx_custom_api_base will be silently ignored — the provider always falls through to the hardcoded "http://localhost:8000/v1" default. Compare with the publicai entry directly above this block, which includes "api_base_env": "PUBLICAI_API_BASE".
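The gating behavior can be reproduced in isolation. A simplified model of the resolution logic (field names follow the snippet above; `os.environ.get` stands in for `get_secret_str`, and this is not the full dynamic_config.py implementation):

```python
import os
from dataclasses import dataclass
from typing import Optional

@dataclass
class SimpleProviderConfig:
    base_url: str
    api_base_env: Optional[str] = None  # None when the field is absent from providers.json

def resolve_base(provider: SimpleProviderConfig) -> str:
    resolved_base = None
    # Mirrors the gate in dynamic_config.py: the env var is only
    # consulted when api_base_env is present in the registry entry.
    if not resolved_base and provider.api_base_env:
        resolved_base = os.environ.get(provider.api_base_env)
    return resolved_base or provider.base_url

os.environ["RAPID_MLX_API_BASE"] = "http://10.0.0.5:8000/v1"

# Without api_base_env, the override is silently ignored:
broken = SimpleProviderConfig(base_url="http://localhost:8000/v1")
assert resolve_base(broken) == "http://localhost:8000/v1"

# With api_base_env, the env var is honored:
fixed = SimpleProviderConfig(base_url="http://localhost:8000/v1",
                             api_base_env="RAPID_MLX_API_BASE")
assert resolve_base(fixed) == "http://10.0.0.5:8000/v1"
```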

Suggested change

Before:
"rapid_mlx": {
  "base_url": "http://localhost:8000/v1",
  "api_key_env": "RAPID_MLX_API_KEY",
  "default_api_key": "not-needed"
}

After:
"rapid_mlx": {
  "base_url": "http://localhost:8000/v1",
  "api_key_env": "RAPID_MLX_API_KEY",
  "api_base_env": "RAPID_MLX_API_BASE",
  "default_api_key": "not-needed"
}

Comment on lines +2586 to +2589
"rerank": false,
"a2a": true,
"interactions": true
}
Contributor


P1 Unsupported a2a and interactions capabilities enabled

"a2a": true and "interactions": true are set for rapid_mlx, but neither the Rapid-MLX documentation, the PR description, nor the rapid_mlx.md provider page mentions support for the A2A (Agent-to-Agent) protocol or the Interactions endpoint.

Contrast with the immediately preceding charity_engine entry (lines 2557–2571), which is also an OpenAI-compatible provider and omits both fields entirely. Unless Rapid-MLX explicitly implements these protocols, advertising them here will route requests to a server that doesn't support them, leading to runtime errors for users who rely on them.

These fields should be removed (or set to false) until A2A/Interactions support is confirmed and documented:

Suggested change

Before:
  "rerank": false,
  "a2a": true,
  "interactions": true
}

After:
  "rerank": false
}
