@jayrodge jayrodge commented Aug 5, 2025

Summary

Adds a comprehensive guide for optimizing OpenAI GPT-OSS models using NVIDIA TensorRT-LLM.

Changes

  • Add detailed guide for optimizing gpt-oss-20b and gpt-oss-120b models
  • Include hardware prerequisites (16 GB+ VRAM, recommended GPUs)
  • Provide installation instructions for TensorRT-LLM via NGC and Docker
  • Add Python API examples for model loading and inference
  • Include performance optimization tips and next steps

Benefits

  • Helps users optimize GPT-OSS models for high-performance inference
  • Provides clear hardware requirements and setup instructions
  • Includes practical code examples for immediate use

@pap-openai pap-openai merged commit 3d32e44 into openai:main Aug 5, 2025
1 check passed