feat: add new backend - MLX #7459
base: main
Conversation
`# Conflicts: # Makefile # web-app/src/routes/settings/providers/$providerName.tsx`
fix: notarize mlx bin
Pull request overview
This PR adds comprehensive MLX inference support for Apple Silicon Macs, enabling native GPU-accelerated model execution using the MLX framework. The implementation provides an OpenAI-compatible API via a Swift server and integrates seamlessly with Jan's existing architecture.
Changes:
- Adds new MLX Swift server with OpenAI-compatible API, streaming support, and batch processing (see the request sketch after this list)
- Implements Rust Tauri plugin for process management and session handling
- Creates TypeScript extension for frontend integration with model loading/unloading
- Updates UI components to support MLX provider alongside existing llamacpp
- Adds prompt caching, vision model support, and tool calling capabilities
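To illustrate the OpenAI-compatible surface described above, here is a minimal TypeScript sketch of a streaming chat request against a locally running server. The port (8080) and model id are placeholders, not values taken from this PR, and the sketch assumes a Node 18+ runtime with the built-in `fetch`.

```typescript
// Minimal sketch: streaming a chat completion from a local OpenAI-compatible
// server such as the MLX server added in this PR. The port (8080) and the
// model id are placeholders, not values taken from this PR.
async function streamChatCompletion(prompt: string): Promise<void> {
  const response = await fetch("http://127.0.0.1:8080/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "mlx-community/some-model", // placeholder model id
      stream: true,
      messages: [{ role: "user", content: prompt }],
    }),
  });

  if (!response.ok || !response.body) {
    throw new Error(`Request failed: ${response.status}`);
  }

  // OpenAI-compatible streaming uses server-sent events: each line looks like
  // "data: <json chunk>" and the stream ends with "data: [DONE]".
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffered = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffered += decoder.decode(value, { stream: true });

    const lines = buffered.split("\n");
    buffered = lines.pop() ?? ""; // keep any partial line for the next chunk

    for (const line of lines) {
      const payload = line.replace(/^data:\s*/, "").trim();
      if (!payload || payload === "[DONE]") continue;
      const chunk = JSON.parse(payload);
      const delta = chunk.choices?.[0]?.delta?.content;
      if (delta) process.stdout.write(delta);
    }
  }
}

streamChatCompletion("Hello from Jan").catch(console.error);
```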
Reviewed changes
Copilot reviewed 70 out of 75 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| mlx-server/Sources/MLXServer/* | Swift HTTP server with OpenAI-compatible endpoints, batch processing, and model runner |
| src-tauri/plugins/tauri-plugin-mlx/* | Rust plugin for MLX process management, session tracking, and cleanup |
| extensions/mlx-extension/src/index.ts | TypeScript extension implementing AIEngine interface for MLX backend |
| web-app/src/services/models/* | Model service updates to support safetensors files and MLX provider |
| web-app/src/routes/settings/providers/$providerName.tsx | Provider settings UI extended for MLX support |
| web-app/src/lib/model-factory.ts | Model factory updated with MLX model creation and reasoning middleware (see the sketch after this table) |
| src-tauri/src/core/server/proxy.rs | Proxy server updated to route requests to both llamacpp and MLX sessions |
| Makefile | Build scripts for compiling and signing MLX server binary |
| package.json | Build tasks for MLX server compilation |
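As a rough illustration of the provider dispatch described for web-app/src/lib/model-factory.ts, here is a hypothetical TypeScript sketch. All identifiers in it (`ModelSpec`, `createModel`, the session starters) are invented for illustration and are not the PR's actual code.

```typescript
// Hypothetical sketch of provider dispatch in a model factory. The type and
// function names are illustrative, not the identifiers actually used in
// web-app/src/lib/model-factory.ts.
type Provider = "llamacpp" | "mlx";

interface ModelSpec {
  id: string;
  provider: Provider;
  path: string; // GGUF file for llamacpp, safetensors directory for MLX
}

interface LoadedModel {
  id: string;
  provider: Provider;
  endpoint: string; // local OpenAI-compatible endpoint for the session
}

function createModel(spec: ModelSpec): LoadedModel {
  switch (spec.provider) {
    case "llamacpp":
      // Existing path: hand off to the llama.cpp backend.
      return { id: spec.id, provider: "llamacpp", endpoint: startLlamaCppSession(spec.path) };
    case "mlx":
      // New path: hand off to the MLX backend (Apple Silicon only).
      return { id: spec.id, provider: "mlx", endpoint: startMlxSession(spec.path) };
    default:
      throw new Error("Unknown provider");
  }
}

// Placeholder session starters; in Jan these responsibilities live in the
// Tauri plugins rather than in the web app.
declare function startLlamaCppSession(path: string): string;
declare function startMlxSession(path: string): string;
```

Keeping the dispatch behind a single factory lets the rest of the web app stay provider-agnostic while the new backend is added alongside llamacpp.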
Force-pushed from ca77995 to 53495a0
Pull request overview
Copilot reviewed 70 out of 75 changed files in this pull request and generated 3 comments.
MLX Integration Architecture
Overview
This PR adds native MLX inference support for Jan, enabling Apple Silicon Macs to run MLX-optimized models (safetensors format) with Metal GPU acceleration. The implementation provides an OpenAI-compatible API via a Swift server, so it integrates seamlessly with the existing Jan frontend. Further optimizations will follow.
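Because the backend only applies to Apple Silicon Macs, the provider has to be gated by platform. The PR does not spell out the gating logic here, but one plausible check in a Node-like environment looks like the sketch below; the actual detection may well live on the Rust side.

```typescript
// One plausible way to gate the MLX provider to Apple Silicon Macs in a
// Node-like environment. This is an assumption for illustration; the PR's
// actual detection logic may differ.
function isAppleSilicon(): boolean {
  return process.platform === "darwin" && process.arch === "arm64";
}

const providers = ["llamacpp", ...(isAppleSilicon() ? ["mlx"] : [])];
console.log("Available providers:", providers);
```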
Screen.Recording.2026-02-04.at.22.55.12.mov
Architecture Diagram
Data Flow
1. Model Download
2. Model Load
3. Inference
4. Model Unload
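A hypothetical TypeScript sketch tying the four steps above together is shown below. Every helper in it (`downloadModel`, `loadModel`, `chatOnce`, `unloadModel`) is a placeholder invented for illustration; in the actual implementation these responsibilities are split between the TypeScript extension and the Rust Tauri plugin.

```typescript
// Hypothetical end-to-end lifecycle matching the four steps above. Every
// helper here is a placeholder for illustration, not an API from this PR.
async function runOnce(modelId: string, prompt: string): Promise<string> {
  // 1. Model Download: fetch the safetensors weights to local storage.
  const localPath = await downloadModel(modelId);

  // 2. Model Load: ask the MLX backend to start a session for this model.
  const session = await loadModel(localPath);

  try {
    // 3. Inference: send a chat request to the session's local endpoint.
    return await chatOnce(session.endpoint, prompt);
  } finally {
    // 4. Model Unload: stop the session and free GPU memory.
    await unloadModel(session.id);
  }
}

// Placeholder declarations so the sketch type-checks.
declare function downloadModel(modelId: string): Promise<string>;
declare function loadModel(path: string): Promise<{ id: string; endpoint: string }>;
declare function chatOnce(endpoint: string, prompt: string): Promise<string>;
declare function unloadModel(sessionId: string): Promise<void>;
```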
Features Added
- `/v1/chat/completions`
- `/metrics` for monitoring
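For completeness, a minimal sketch of reading the `/metrics` endpoint follows; the port and the assumption that it returns plain text are illustrative, not details confirmed by this PR.

```typescript
// Minimal sketch of polling the /metrics endpoint for monitoring. The port
// and the plain-text response format are assumptions.
async function fetchMetrics(): Promise<string> {
  const response = await fetch("http://127.0.0.1:8080/metrics");
  if (!response.ok) {
    throw new Error(`Metrics request failed: ${response.status}`);
  }
  return response.text();
}

fetchMetrics().then(console.log).catch(console.error);
```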