A curated list of awesome GPT-OSS resources, tools, tutorials, and projects. GPT-OSS is OpenAI's first open-weight language model family since GPT-2.
GPT-OSS represents OpenAI's return to open-source AI development with two powerful reasoning models: gpt-oss-120b and gpt-oss-20b. Released under the Apache 2.0 license, these models deliver state-of-the-art performance with configurable reasoning effort, full chain-of-thought access, and native tool use capabilities.
- Official Resources
- Models
- Inference Engines
- Local Deployment
- Cloud Deployment
- Development Tools
- Integrations
- Fine-tuning
- Applications
- Tutorials
- Research
- Safety
- Community
- Comparison with Other Models
- Contributing
- License
- Star History
- OpenAI GPT-OSS Announcement - Official release announcement
- GPT-OSS GitHub Repository - Official implementation and reference code
- GPT-OSS Model Card - Comprehensive model documentation
- Open Models Page - OpenAI's dedicated open models page
- OpenAI Harmony - Response format library for GPT-OSS
- Try gpt-oss - gpt-oss playground
- gpt-oss-120b - 117B parameters total, 5.1B active per token
- gpt-oss-20b - 21B parameters total, 3.6B active per token
| Model | Parameters | Active Parameters | Memory Requirement | Hardware |
|---|---|---|---|---|
| gpt-oss-120b | 117B | 5.1B | 80GB | Single H100 |
| gpt-oss-20b | 21B | 3.6B | 16GB | Consumer GPU |
- Apache 2.0 License - Permissive open-source licensing
- MXFP4 Quantization - Native 4-bit quantization for efficient inference
- Mixture of Experts (MoE) - Optimized for performance and efficiency
- Configurable Reasoning - Adjustable effort levels (low, medium, high)
- Full Chain-of-Thought - Complete access to reasoning process
- Tool Use Capabilities - Web browsing, Python execution, function calling
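When served through an OpenAI-compatible endpoint, tool use is driven by the standard function-calling schema: the request advertises JSON-schema tool definitions the model may invoke. A minimal sketch of such a request body (the `get_weather` tool and the `gpt-oss:20b` model name are illustrative assumptions, not fixed identifiers):

```python
import json

# Hypothetical tool definition in the OpenAI function-calling format.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def build_tool_request(prompt, tools, model="gpt-oss:20b"):
    """Chat request body advertising tools the model may call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": tools,
    }

print(json.dumps(build_tool_request("Weather in Paris?", [get_weather_tool]), indent=2))
```

When the model decides to call a tool, the response contains a `tool_calls` entry with the chosen function name and arguments, which your application executes and feeds back as a `tool` message.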
- vLLM GPT-OSS Support - Official vLLM implementation
- Flash Attention 3 Kernels - Optimized attention kernels for Hopper GPUs
- Installation:

  ```bash
  pip install --pre vllm==0.10.1+gptoss
  ```
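Once installed, `vllm serve openai/gpt-oss-20b` exposes an OpenAI-compatible endpoint (by default on `localhost:8000`). A sketch of a chat-completion request body that also sets the configurable reasoning effort via the system message; the `Reasoning: <level>` phrasing follows OpenAI's published guidance for these models, but treat the exact wording as an assumption and check the model card:

```python
import json

def build_chat_request(prompt, model="openai/gpt-oss-20b", effort="medium"):
    """OpenAI-style chat request; reasoning effort (low/medium/high)
    is communicated through the system message."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": f"Reasoning: {effort}"},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": 512,
    }

# POST this as JSON to http://localhost:8000/v1/chat/completions
print(json.dumps(build_chat_request("Summarize MoE routing in two sentences."), indent=2))
```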
- Ollama GPT-OSS Models - Easy local deployment
- OpenAI Cookbook - Ollama Guide - Official tutorial
- Quick start:

  ```bash
  ollama pull gpt-oss:20b && ollama run gpt-oss:20b
  ```
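With the model pulled, Ollama's REST API (default port 11434) can be called from any language. A stdlib-only sketch against the native `/api/chat` endpoint; Ollama also offers an OpenAI-compatible `/v1` route if you prefer the OpenAI SDK:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # default Ollama endpoint

def build_request(prompt, model="gpt-oss:20b"):
    """JSON body for Ollama's /api/chat endpoint (non-streaming)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def chat(prompt):
    """Send a single-turn chat to a locally running Ollama server."""
    data = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

# Usage (requires a running Ollama server with gpt-oss:20b pulled):
#   print(chat("In one sentence, what is a mixture-of-experts model?"))
```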
- llama.cpp GPT-OSS Support - CPU and GPU inference
- GGUF Models - Quantized models for llama.cpp
- Hugging Face Transformers - Official integration
- Transformers Serve - OpenAI-compatible server
- plux - The fastest way to connect your files to AI. Think file explorer + "add to AI" button: discover, send, and manage your files with one click.
- LM Studio - User-friendly desktop application
- Jan - Open-source ChatGPT alternative
- Msty - Multi-platform LLM client
- Cherry Studio - Desktop client with Ollama support
- NVIDIA RTX Optimization - RTX-optimized deployment
- Apple Metal Implementation - Native Metal support for Apple Silicon
- AMD ROCm Support - AMD GPU compatibility
- Azure AI Foundry - Microsoft's AI platform
- Hugging Face Inference Providers - Multi-provider access
- AWS SageMaker - Amazon's ML platform
- Northflank - GPU-optimized deployment
- Fireworks AI - High-performance inference
- Cerebras - Ultra-fast inference (2-4k tokens/sec)
- Microsoft AI Foundry Local - On-device inference for Windows
- Ollama Turbo - Hosted Ollama service for large models
- gpt-oss - Official Python package
- OpenAI Python SDK - Compatible with local endpoints
- LangChain - LLM application framework
- LiteLLM - Unified API across providers
- Responses.js - Response API client library
- Vercel AI SDK - React/Next.js integration
- OpenAI JS SDK - Node.js client
- Chat Completions API - Compatible with OpenAI format
- Responses API - Advanced streaming interface
- OpenAI Harmony Format - New response format
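Harmony output interleaves named channels (e.g. `analysis` for chain-of-thought and `final` for the user-facing answer) delimited by special tokens. As a rough illustration, here is a toy parser run over a synthetic completion string; the token names follow the openai/harmony documentation, but for real use rely on the openai-harmony library rather than regexes:

```python
import re

# Synthetic harmony-style completion for illustration only.
SAMPLE = (
    "<|channel|>analysis<|message|>User asks 2+2; trivial.<|end|>"
    "<|start|>assistant<|channel|>final<|message|>4"
)

def extract_channels(text):
    """Map channel name -> message content from a harmony completion."""
    pattern = r"<\|channel\|>(\w+)<\|message\|>(.*?)(?=<\|end\|>|<\|start\|>|\Z)"
    return {name: msg for name, msg in re.findall(pattern, text, re.S)}

channels = extract_channels(SAMPLE)
print(channels["final"])  # the user-facing answer: "4"
```

The practical upshot: a raw completion contains the full reasoning trace alongside the answer, so applications must separate channels before showing output to users.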
- Open WebUI - Feature-rich web interface
- ChatGPT-Next-Web - Self-hosted ChatGPT UI
- LibreChat - Multi-model chat platform
- LobeChat - Modern chat interface
- Continue - Open-source AI code assistant
- AI Toolkit for VSCode - Microsoft's official VSCode extension
- CodeGPT - IntelliJ plugin
- OpenAI Agents SDK - Official agent development framework
- AutoGen - Multi-agent conversation framework
- CrewAI - Role-playing AI agents
- LangGraph - Agent workflow orchestration
- TRL (Transformer Reinforcement Learning) - Hugging Face training library
- OpenAI Cookbook - LoRA Fine-tuning - Official LoRA example
- Unsloth - Fast fine-tuning framework
- QLoRA - Quantized fine-tuning
- gpt-oss-120b: Single H100 node for LoRA fine-tuning
- gpt-oss-20b: Consumer hardware compatible
- Techniques: LoRA, QLoRA, Parameter-Efficient Fine-Tuning (PEFT)
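The core trick all three techniques build on is LoRA's low-rank update: the base weight matrix is frozen, and two small trainable matrices supply a rank-r correction. A dependency-free sketch of the arithmetic (illustration only; in practice use TRL/PEFT):

```python
# LoRA sketch: frozen base weight W (d_out x d_in) is adapted by two
# small trainable matrices B (d_out x r) and A (r x d_in), r << d_in,
# giving the effective weight W + (alpha / r) * (B @ A).

def matmul(X, Y):
    """Plain-Python matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def lora_effective_weight(W, A, B, alpha=16, r=4):
    """Combine a frozen weight with a scaled low-rank LoRA update."""
    delta = matmul(B, A)
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j]
             for j in range(len(W[0]))]
            for i in range(len(W))]

# Example: 2x2 base weight, rank-1 update.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [0.0]]   # d_out x r
A = [[0.5, 0.5]]     # r x d_in
print(lora_effective_weight(W, A, B, alpha=1, r=1))
# -> [[1.5, 0.5], [0.0, 1.0]]
```

Because only A and B are trained, the optimizer state scales with r rather than the full weight shape, which is what makes single-node fine-tuning of gpt-oss-120b feasible.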
- Anything LLM - Private document chatbot
- Perplexica - AI-powered search engine
- Dify - LLM application development platform
- FlowiseAI - Visual LLM app builder
- Aider - AI pair programming
- GPT Engineer - Code generation from specs
- Open Interpreter - Local code interpreter
- MetaGPT - Multi-agent software development
- Paper QA - Scientific paper analysis
- LlamaIndex - Document indexing and search
- RAG Flow - Retrieval-Augmented Generation
- Chroma - Vector database for AI
- OpenAI Cookbook - GPT-OSS Guide - Official comprehensive guide
- How to Run GPT-OSS Locally - Step-by-step local setup
- GPT-OSS with vLLM - Production deployment guide
- Harmony Response Format - Understanding the new format
- Fine-tuning GPT-OSS - Custom model training
- Building AI Agents - Agent development with GPT-OSS
- Tool Use Examples - Browser and Python tools
- GPT-OSS Setup on AWS - Complete AWS deployment guide
- GPU Optimization Guide - Hardware-specific optimizations
- Docker Deployment - Containerized deployment
- GPT-OSS Model Paper - Technical specifications and benchmarks
- Mixture of Experts Research - MoE architecture foundations
- MXFP4 Quantization - 4-bit quantization techniques
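To make the quantization scheme concrete: MXFP4 stores each weight as a 4-bit E2M1 float (representable magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6) with one shared power-of-two scale per block (32 elements in the OCP Microscaling spec). A toy numeric sketch, under those assumptions; real kernels pack bits and fuse the dequantization into the matmul, this only shows the rounding behavior:

```python
import math

# Representable FP4 (E2M1) values per the OCP Microscaling spec.
E2M1_POS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
E2M1 = sorted(E2M1_POS + [-v for v in E2M1_POS if v > 0])

def quantize_block(block):
    """Quantize one block: shared power-of-two scale + per-element FP4."""
    amax = max(abs(x) for x in block)
    if amax == 0:
        return [0.0] * len(block), 1.0
    # Choose a power-of-two scale so the largest magnitude fits in [0, 6].
    scale = 2.0 ** math.ceil(math.log2(amax / 6.0))
    q = [min(E2M1, key=lambda v: abs(v - x / scale)) for x in block]
    return q, scale

def dequantize_block(q, scale):
    """Reconstruct approximate values from FP4 codes and the block scale."""
    return [v * scale for v in q]
```

The key design point is that each element costs ~4 bits plus an amortized share of one scale, which is how a 117B-parameter model fits in 80GB.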
- Reasoning: Near-parity with o4-mini on core benchmarks
- Coding: Strong performance on Codeforces competitions
- Mathematics: Excellent results on AIME 2024 & 2025
- Tool Use: Superior performance on TauBench agentic evaluation
- Health: Outperforms proprietary models on HealthBench
- Simon Willison's Analysis - Independent technical review
- Comparative Benchmarks - Performance vs other models
- Enterprise Adoption Study - Market analysis
- Preparedness Framework Testing - Adversarial fine-tuning results
- Red Teaming Challenge - $500,000 safety challenge
- Safety Advisory Group Review - External expert evaluation
- Content Filtering - Content moderation tools
- Chain-of-Thought Monitoring - Reasoning transparency
- Usage Policy - Model usage guidelines
- OpenAI Developer Forum - Official community
- Hugging Face Forums - ML community discussions
- Reddit r/LocalLLaMA - Local model enthusiasts
- Discord Servers - Real-time community chat
- OpenAI - Official repositories
- Hugging Face - ML ecosystem
- vLLM Team - Inference optimization
- Ollama - Local deployment tools
- OpenAI Blog - Official announcements
- Hugging Face Blog - Technical deep-dives
- AI Research Twitter - Latest developments
- Papers with Code - Research tracking
| Feature | GPT-OSS-120b | GPT-OSS-20b | Meta Llama 3.3 70b | DeepSeek-R1 |
|---|---|---|---|---|
| License | Apache 2.0 | Apache 2.0 | Custom License | MIT |
| Parameters | 117B (5.1B active) | 21B (3.6B active) | 70B | 671B (37B active) |
| Memory | 80GB | 16GB | 140GB | 340GB |
| Reasoning | High | Medium | Limited | Excellent |
| Tool Use | Native | Native | Advanced | |
| CoT Access | Full | Full | Hidden | Full |
Contributions are welcome! Please read the contribution guidelines first.
- Fork this repository
- Create a new branch for your addition
- Add your resource with a brief description
- Ensure it follows the existing format
- Submit a pull request
- Must be related to GPT-OSS models
- Should be actively maintained
- Must be publicly available
- Should provide clear value to the community
This awesome list is licensed under the CC0 1.0 Universal license.
Made with love by the community. If you find this list helpful, please star it and share it with others!
Note: GPT-OSS models require the harmony response format to function correctly. Always use the provided chat templates or the OpenAI harmony library for proper interaction.