A curated list of awesome GPT-OSS resources, tools, tutorials, and projects. GPT-OSS is OpenAI's first open-weight language model family since GPT-2.
GPT-OSS represents OpenAI's return to open-source AI development with two powerful reasoning models: gpt-oss-120b and gpt-oss-20b. Released under the Apache 2.0 license, these models deliver state-of-the-art performance with configurable reasoning effort, full chain-of-thought access, and native tool use capabilities.
- Official Resources
- Models
- Inference Engines
- Local Deployment
- Cloud Deployment
- Development Tools
- Integrations
- Fine-tuning
- Applications
- Tutorials
- Research
- Safety
- Community
- Comparison with Other Models
- Contributing
- License
- Star History
- OpenAI GPT-OSS Announcement - Official release announcement
- GPT-OSS GitHub Repository - Official implementation and reference code
- GPT-OSS Model Card - Comprehensive model documentation
- Open Models Page - OpenAI's dedicated open models page
- OpenAI Harmony - Response format library for GPT-OSS
- Try gpt-oss - gpt-oss playground
- gpt-oss-120b - 117B parameters total, 5.1B active per token
- gpt-oss-20b - 21B parameters total, 3.6B active per token
| Model | Parameters | Active Parameters | Memory Requirement | Hardware |
|---|---|---|---|---|
| gpt-oss-120b | 117B | 5.1B | 80GB | Single H100 |
| gpt-oss-20b | 21B | 3.6B | 16GB | Consumer GPU |
- Apache 2.0 License - Permissive open-source licensing
- MXFP4 Quantization - Native 4-bit quantization for efficient inference
- Mixture of Experts (MoE) - Optimized for performance and efficiency
- Configurable Reasoning - Adjustable effort levels (low, medium, high)
- Full Chain-of-Thought - Complete access to reasoning process
- Tool Use Capabilities - Web browsing, Python execution, function calling
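When served through an OpenAI-compatible endpoint, tool use is driven by the standard function-calling schema: the request advertises JSON-schema tool definitions the model may invoke. A minimal sketch of such a request body (the `get_weather` tool and the `gpt-oss:20b` model name are illustrative assumptions, not fixed identifiers):

```python
import json

# Hypothetical tool definition in the OpenAI function-calling format.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def build_tool_request(prompt, tools, model="gpt-oss:20b"):
    """Chat request body advertising tools the model may call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": tools,
    }

print(json.dumps(build_tool_request("Weather in Paris?", [get_weather_tool]), indent=2))
```

When the model decides to call a tool, the response contains a `tool_calls` entry with the chosen function name and arguments, which your application executes and feeds back as a `tool` message.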
- vLLM GPT-OSS Support - Official vLLM implementation
- Flash Attention 3 Kernels - Optimized attention kernels for Hopper GPUs
- Installation:

  ```bash
  pip install --pre vllm==0.10.1+gptoss
  ```
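Once installed, `vllm serve openai/gpt-oss-20b` exposes an OpenAI-compatible endpoint (by default on `localhost:8000`). A sketch of a chat-completion request body that also sets the configurable reasoning effort via the system message; the `Reasoning: <level>` phrasing follows OpenAI's published guidance for these models, but treat the exact wording as an assumption and check the model card:

```python
import json

def build_chat_request(prompt, model="openai/gpt-oss-20b", effort="medium"):
    """OpenAI-style chat request; reasoning effort (low/medium/high)
    is communicated through the system message."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": f"Reasoning: {effort}"},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": 512,
    }

# POST this as JSON to http://localhost:8000/v1/chat/completions
print(json.dumps(build_chat_request("Summarize MoE routing in two sentences."), indent=2))
```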
- Ollama GPT-OSS Models - Easy local deployment
- OpenAI Cookbook - Ollama Guide - Official tutorial
- Quick start:

  ```bash
  ollama pull gpt-oss:20b && ollama run gpt-oss:20b
  ```
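With the model pulled, Ollama's REST API (default port 11434) can be called from any language. A stdlib-only sketch against the native `/api/chat` endpoint; Ollama also offers an OpenAI-compatible `/v1` route if you prefer the OpenAI SDK:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # default Ollama endpoint

def build_request(prompt, model="gpt-oss:20b"):
    """JSON body for Ollama's /api/chat endpoint (non-streaming)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def chat(prompt):
    """Send a single-turn chat to a locally running Ollama server."""
    data = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

# Usage (requires a running Ollama server with gpt-oss:20b pulled):
#   print(chat("In one sentence, what is a mixture-of-experts model?"))
```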
- llama.cpp GPT-OSS Support - CPU and GPU inference
- GGUF Models - Quantized models for llama.cpp
- Hugging Face Transformers - Official integration
- Transformers Serve - OpenAI-compatible server
- plux - The fastest way to connect your files to AI. Think file explorer + "add to AI" button: discover, send, and manage your files with one click.
- LM Studio - User-friendly desktop application
- Jan - Open-source ChatGPT alternative
- Msty - Multi-platform LLM client
- Cherry Studio - Desktop client with Ollama support
- NVIDIA RTX Optimization - RTX-optimized deployment
- Apple Metal Implementation - Native Metal support for Apple Silicon
- AMD ROCm Support - AMD GPU compatibility
- Azure AI Foundry - Microsoft's AI platform
- Hugging Face Inference Providers - Multi-provider access
- AWS SageMaker - Amazon's ML platform
- Northflank - GPU-optimized deployment
- Fireworks AI - High-performance inference
- Cerebras - Ultra-fast inference (2-4k tokens/sec)
- Microsoft AI Foundry Local - On-device inference for Windows
- Ollama Turbo - Hosted Ollama service for large models
- gpt-oss - Official Python package
- OpenAI Python SDK - Compatible with local endpoints
- LangChain - LLM application framework
- LiteLLM - Unified API across providers
- Responses.js - Response API client library
- Vercel AI SDK - React/Next.js integration
- OpenAI JS SDK - Node.js client
- Chat Completions API - Compatible with OpenAI format
- Responses API - Advanced streaming interface
- OpenAI Harmony Format - New response format
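Harmony output interleaves named channels (e.g. `analysis` for chain-of-thought and `final` for the user-facing answer) delimited by special tokens. As a rough illustration, here is a toy parser run over a synthetic completion string; the token names follow the openai/harmony documentation, but for real use rely on the openai-harmony library rather than regexes:

```python
import re

# Synthetic harmony-style completion for illustration only.
SAMPLE = (
    "<|channel|>analysis<|message|>User asks 2+2; trivial.<|end|>"
    "<|start|>assistant<|channel|>final<|message|>4"
)

def extract_channels(text):
    """Map channel name -> message content from a harmony completion."""
    pattern = r"<\|channel\|>(\w+)<\|message\|>(.*?)(?=<\|end\|>|<\|start\|>|\Z)"
    return {name: msg for name, msg in re.findall(pattern, text, re.S)}

channels = extract_channels(SAMPLE)
print(channels["final"])  # the user-facing answer: "4"
```

The practical upshot: a raw completion contains the full reasoning trace alongside the answer, so applications must separate channels before showing output to users.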
- Open WebUI - Feature-rich web interface
- ChatGPT-Next-Web - Self-hosted ChatGPT UI
- LibreChat - Multi-model chat platform
- LobeChat - Modern chat interface
- Continue - Open-source AI code assistant
- AI Toolkit for VSCode - Microsoft's official VSCode extension
- CodeGPT - IntelliJ plugin
- OpenAI Agents SDK - Official agent development framework
- AutoGen - Multi-agent conversation framework
- CrewAI - Role-playing AI agents
- LangGraph - Agent workflow orchestration
- TRL (Transformer Reinforcement Learning) - Hugging Face training library
- OpenAI Cookbook - LoRA Fine-tuning - Official LoRA example
- Unsloth - Fast fine-tuning framework
- QLoRA - Quantized fine-tuning
- gpt-oss-120b: Single H100 node for LoRA fine-tuning
- gpt-oss-20b: Consumer hardware compatible
- Techniques: LoRA, QLoRA, Parameter-Efficient Fine-Tuning (PEFT)
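The core trick all three techniques build on is LoRA's low-rank update: the base weight matrix is frozen, and two small trainable matrices supply a rank-r correction. A dependency-free sketch of the arithmetic (illustration only; in practice use TRL/PEFT):

```python
# LoRA sketch: frozen base weight W (d_out x d_in) is adapted by two
# small trainable matrices B (d_out x r) and A (r x d_in), r << d_in,
# giving the effective weight W + (alpha / r) * (B @ A).

def matmul(X, Y):
    """Plain-Python matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def lora_effective_weight(W, A, B, alpha=16, r=4):
    """Combine a frozen weight with a scaled low-rank LoRA update."""
    delta = matmul(B, A)
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j]
             for j in range(len(W[0]))]
            for i in range(len(W))]

# Example: 2x2 base weight, rank-1 update.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [0.0]]   # d_out x r
A = [[0.5, 0.5]]     # r x d_in
print(lora_effective_weight(W, A, B, alpha=1, r=1))
# -> [[1.5, 0.5], [0.0, 1.0]]
```

Because only A and B are trained, the optimizer state scales with r rather than the full weight shape, which is what makes single-node fine-tuning of gpt-oss-120b feasible.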
- Anything LLM - Private document chatbot
- Perplexica - AI-powered search engine
- Dify - LLM application development platform
- FlowiseAI - Visual LLM app builder
- Aider - AI pair programming
- GPT Engineer - Code generation from specs
- Open Interpreter - Local code interpreter
- MetaGPT - Multi-agent software development
- Paper QA - Scientific paper analysis
- LlamaIndex - Document indexing and search
- RAG Flow - Retrieval-Augmented Generation
- Chroma - Vector database for AI
- OpenAI Cookbook - GPT-OSS Guide - Official comprehensive guide
- How to Run GPT-OSS Locally - Step-by-step local setup
- GPT-OSS with vLLM - Production deployment guide
- Harmony Response Format - Understanding the new format
- Fine-tuning GPT-OSS - Custom model training
- Building AI Agents - Agent development with GPT-OSS
- Tool Use Examples - Browser and Python tools
- GPT-OSS Setup on AWS - Complete AWS deployment guide
- GPU Optimization Guide - Hardware-specific optimizations
- Docker Deployment - Containerized deployment
- GPT-OSS Model Paper - Technical specifications and benchmarks
- Mixture of Experts Research - MoE architecture foundations
- MXFP4 Quantization - 4-bit quantization techniques
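To make the quantization scheme concrete: MXFP4 stores each weight as a 4-bit E2M1 float (representable magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6) with one shared power-of-two scale per block (32 elements in the OCP Microscaling spec). A toy numeric sketch, under those assumptions; real kernels pack bits and fuse the dequantization into the matmul, this only shows the rounding behavior:

```python
import math

# Representable FP4 (E2M1) values per the OCP Microscaling spec.
E2M1_POS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
E2M1 = sorted(E2M1_POS + [-v for v in E2M1_POS if v > 0])

def quantize_block(block):
    """Quantize one block: shared power-of-two scale + per-element FP4."""
    amax = max(abs(x) for x in block)
    if amax == 0:
        return [0.0] * len(block), 1.0
    # Choose a power-of-two scale so the largest magnitude fits in [0, 6].
    scale = 2.0 ** math.ceil(math.log2(amax / 6.0))
    q = [min(E2M1, key=lambda v: abs(v - x / scale)) for x in block]
    return q, scale

def dequantize_block(q, scale):
    """Reconstruct approximate values from FP4 codes and the block scale."""
    return [v * scale for v in q]
```

The key design point is that each element costs ~4 bits plus an amortized share of one scale, which is how a 117B-parameter model fits in 80GB.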
- Reasoning: Near-parity with o4-mini on core benchmarks
- Coding: Strong performance on Codeforces competitions
- Mathematics: Excellent results on AIME 2024 & 2025
- Tool Use: Superior performance on TauBench agentic evaluation
- Health: Outperforms proprietary models on HealthBench
- Simon Willison's Analysis - Independent technical review
- Comparative Benchmarks - Performance vs other models
- Enterprise Adoption Study - Market analysis
- Preparedness Framework Testing - Adversarial fine-tuning results
- Red Teaming Challenge - $500,000 safety challenge
- Safety Advisory Group Review - External expert evaluation
- Content Filtering - Content moderation tools
- Chain-of-Thought Monitoring - Reasoning transparency
- Usage Policy - Model usage guidelines
- OpenAI Developer Forum - Official community
- Hugging Face Forums - ML community discussions
- Reddit r/LocalLLaMA - Local model enthusiasts
- Discord Servers - Real-time community chat
- OpenAI - Official repositories
- Hugging Face - ML ecosystem
- vLLM Team - Inference optimization
- Ollama - Local deployment tools
- OpenAI Blog - Official announcements
- Hugging Face Blog - Technical deep-dives
- AI Research Twitter - Latest developments
- Papers with Code - Research tracking
| Feature | GPT-OSS-120b | GPT-OSS-20b | Meta Llama 3.3 70b | DeepSeek-R1 |
|---|---|---|---|---|
| License | Apache 2.0 | Apache 2.0 | Custom License | MIT |
| Parameters | 117B (5.1B active) | 21B (3.6B active) | 70B | 671B (37B active) |
| Memory | 80GB | 16GB | 140GB | 340GB |
| Reasoning | High | Medium | Limited | Excellent |
| Tool Use | Native | Native | Advanced | |
| CoT Access | Full | Full | Hidden | Full |
Contributions are welcome! Please read the contribution guidelines first.
- Fork this repository
- Create a new branch for your addition
- Add your resource with a brief description
- Ensure it follows the existing format
- Submit a pull request
- Must be related to GPT-OSS models
- Should be actively maintained
- Must be publicly available
- Should provide clear value to the community
This awesome list is licensed under the CC0 1.0 Universal license.
Made with love by the community. If you find this list helpful, please star it and share it with others!
Note: GPT-OSS models require the harmony response format to function correctly. Always use the provided chat templates or the OpenAI harmony library for proper interaction.