Official example notebooks for MLX-LM-LoRA - a powerful library for efficient LLM training on Apple Silicon.
This repository contains real-world, production-ready examples demonstrating how to use MLX-LM-LoRA for various training scenarios on Apple Silicon Macs. Each notebook is a complete, step-by-step tutorial that you can run directly on your Mac.
- ⚡ Ultra-efficient: Train 4B+ parameter models on 16GB RAM
- 🎯 Long context: Support for 32K+ token sequences
- 🔧 LoRA optimized: Memory-efficient fine-tuning with Low-Rank Adaptation
- 💾 Quantization: 4-bit, 6-bit, and 8-bit quantization support
- 🍎 Apple Silicon native: Optimized for M1/M2/M3/M4 chips
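To see why LoRA keeps memory so low, here is a back-of-the-envelope sketch (plain Python, no MLX required): instead of updating a full weight matrix `W` of shape `(d_out, d_in)`, LoRA trains two small matrices `A` of shape `(rank, d_in)` and `B` of shape `(d_out, rank)`. The layer dimensions below are illustrative, not taken from any particular model config.

```python
# Back-of-the-envelope parameter count for one LoRA-adapted linear layer.
# Full fine-tuning updates all d_out * d_in weights; LoRA only trains
# the low-rank factors A (rank x d_in) and B (d_out x rank).

def lora_param_counts(d_out, d_in, rank):
    """Return (full fine-tuning params, LoRA params) for one linear layer."""
    full = d_out * d_in
    lora = rank * d_in + d_out * rank  # A plus B
    return full, lora

# Example: a hypothetical 4096x4096 projection with rank 16.
full, lora = lora_param_counts(4096, 4096, rank=16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {full // lora}x fewer")
```

With rank 16 on a 4096×4096 layer, the trainable parameters shrink by a factor of 128, which is the main reason multi-billion-parameter models fit in 16GB of unified memory.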
Located in finetuning/
Learn how to fine-tune large language models using Supervised Fine-Tuning (SFT):
- Qwen3_4B_Instruct.ipynb: Complete tutorial for instruction fine-tuning
- Dataset preparation and formatting
- LoRA configuration and optimization
- Training with long context (32K+ tokens)
- Model evaluation and comparison
- Saving and sharing on Hugging Face Hub
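As a rough illustration of the dataset-preparation step, the sketch below writes training examples as JSONL in the common `messages` chat schema. The exact schema the notebook expects is documented inside the notebook itself; the example content here is made up.

```python
import json
import os
import tempfile

# Hypothetical instruction-tuning examples in the widely used
# "messages" chat format (one JSON object per line).
examples = [
    {"messages": [
        {"role": "user", "content": "What is MLX?"},
        {"role": "assistant", "content": "MLX is Apple's array framework for Apple Silicon."},
    ]},
]

path = os.path.join(tempfile.gettempdir(), "train.jsonl")
with open(path, "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Re-read the file to sanity-check the formatting.
with open(path) as f:
    rows = [json.loads(line) for line in f]
print(len(rows), rows[0]["messages"][1]["role"])
```

Validating the file this way before training catches malformed lines early, which matters for long runs.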
Located in preference/
Coming soon! Examples including:
- Direct Preference Optimization (DPO)
- RLHF with human preferences
- Reward model training
- Policy optimization
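For intuition ahead of those examples, the standard DPO objective can be sketched with scalar sequence log-probabilities (real trainers compute these from token logits over batches; this is not the library's API, just the underlying formula):

```python
import math

# Minimal sketch of the DPO loss: -log sigmoid(beta * (policy log-ratio
# of chosen vs. rejected, minus the same ratio under the reference model)).
def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    logits = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# When policy and reference agree exactly, the loss sits at log(2).
print(round(dpo_loss(0.0, 0.0, 0.0, 0.0), 4))
# When the policy prefers the chosen answer more than the reference does,
# the loss drops below log(2).
print(round(dpo_loss(-10.0, -20.0, -12.0, -18.0), 4))
```

The appeal of DPO is visible here: it needs only log-probabilities from two models, with no separate reward model or sampling loop.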
Located in rl/
Coming soon! Examples including:
- Proximal Policy Optimization (PPO)
- Reward-based training
- Multi-turn dialogue optimization
- Task-specific RL fine-tuning
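As background for the PPO examples, the clipped surrogate objective can be sketched for a single sample (actual training averages this over trajectories of token-level ratios and advantages; this is the textbook formula, not the library's implementation):

```python
# Minimal sketch of PPO's clipped surrogate objective for one sample.
# ratio = pi_new(a|s) / pi_old(a|s); clipping caps how far one update
# can push the policy.
def ppo_clip_objective(ratio, advantage, eps=0.2):
    clipped = max(min(ratio, 1.0 + eps), 1.0 - eps)
    return min(ratio * advantage, clipped * advantage)

# A large ratio with positive advantage gets clipped at 1 + eps.
print(ppo_clip_objective(1.5, 2.0))
```

Taking the minimum of the clipped and unclipped terms makes the objective pessimistic, which is what keeps PPO updates stable.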
- Apple Silicon Mac (M1/M2/M3/M4 series)
- macOS 13.0 or later
- Python 3.8+
- At least 16GB unified memory (24GB+ recommended for larger models)
- Clone this repository:

```bash
git clone https://github.com/Goekdeniz-Guelmez/mlx-lm-lora-example-notebooks.git
cd mlx-lm-lora-example-notebooks
```

- Install MLX-LM-LoRA:

```bash
pip install -U mlx-lm-lora
```

- Install Jupyter:

```bash
pip install -U jupyter
```

- Install wandb:

```bash
pip install -U wandb
```

- Open the notebook you want to try:

```bash
jupyter notebook finetuning/Qwen3_4B_Instruct_32k.ipynb
```

Or use VS Code with the Jupyter extension for the best experience!
- Choose your use case: Pick a notebook from the categories above
- Follow the tutorial: Each notebook has detailed explanations for every step
- Customize: Adjust hyperparameters for your specific needs
- Train: Run the cells and watch your model improve!
- Share: Upload your fine-tuned adapters to Hugging Face Hub
- Adjust `batch_size` based on your available RAM
- Use `gradient_accumulation_steps` to simulate larger batches
- Enable `grad_checkpoint` for training with ultra-long contexts
- Use 4-bit quantization for maximum memory efficiency
- Increase `seq_step_size` if you have more memory
- Use a larger `batch_size` on M2/M3 Ultra systems
- Adjust `num_layers` in the LoRA config to train fewer layers
- Increase the LoRA `rank` for more expressive adaptations
- Train for more epochs on larger datasets
- Use higher precision (6-bit or 8-bit) if memory allows
- Fine-tune your validation strategy for your specific task
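The gradient-accumulation tip above rests on a simple identity: averaging gradients over equal-sized micro-batches and stepping once matches a single full-batch step. A toy scalar model is enough to see it (illustrative only; the trainer handles this internally via `gradient_accumulation_steps`):

```python
# Toy demonstration that gradient accumulation reproduces a full-batch
# update, using a scalar model y ~ w * x with mean-squared-error loss.
def grad(w, batch):
    """MSE gradient of w over a batch of (x, y) pairs."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 5.0), (4.0, 9.0)]
w, lr = 0.0, 0.01

# One step on the full batch.
w_full = w - lr * grad(w, data)

# Average gradients over two micro-batches, then step once.
micro = [data[:2], data[2:]]
acc = sum(grad(w, mb) for mb in micro) / len(micro)
w_acc = w - lr * acc

print(abs(w_full - w_acc) < 1e-12)  # the two updates agree
```

This is why accumulation lets you trade wall-clock time for RAM: the effective batch size grows without ever materializing the full batch's activations at once.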
We welcome contributions! If you have interesting examples or improvements:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-example`)
- Commit your changes (`git commit -m 'Add amazing example'`)
- Push to the branch (`git push origin feature/amazing-example`)
- Open a Pull Request
- MLX-LM-LoRA Library: github.com/Goekdeniz-Guelmez/mlx-lm-lora
- MLX Framework: github.com/ml-explore/mlx
- Documentation: Check individual notebook files for detailed explanations
See finetuning/Qwen3_4B_Instruct.ipynb for a complete example of:
- Loading and quantizing Qwen3-4B-Instruct
- Preparing instruction datasets
- Training with LoRA on long contexts
- Evaluating improvements
- Sharing your adapter
Typical training time: 10-30 minutes on M2 Pro/Max
This project is licensed under the MIT License - see the LICENSE file for details.
- Apple MLX team for the amazing framework
- Hugging Face for model hosting and datasets
- The open-source AI community for continuous inspiration
Ready to start training? Pick a notebook and dive in! 🚀
For questions or issues, please open an issue on GitHub.