
gpt-oss-tvm


This project aims to compile the OpenAI gpt-oss model using Apache TVM and run it on a target device.

Project Goals

Wiki

Visit the Wiki Home or Design Philosophy page to read more about the project's goals and objectives!

Setup

To support gpt-oss correctly, TVM and MLC LLM need to be built with a few patches.

Please refer to our Wiki - Setup & Run page for setup instructions.

Download model

Note

While TVM supports multiple hardware backends, this project has been mainly tested with the metal target on macOS. As the model uses the original mxfp4 and bfloat16 weights without further quantization, an Apple Silicon Mac with 24 GB or more of unified memory is recommended.
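As a rough back-of-envelope check on that memory recommendation (the ~21 B total parameter count and the ~4.25 bits/parameter cost of mxfp4 are approximations, not numbers taken from this project):

```python
# Back-of-envelope weight-memory estimate for gpt-oss-20b (all figures approximate).
total_params = 21e9   # gpt-oss-20b total parameters (approximate)
mxfp4_bits = 4.25     # 4-bit values plus shared block-scale overhead (approximate)

weight_gb = total_params * mxfp4_bits / 8 / 1e9  # bits -> bytes -> GB
print(f"~{weight_gb:.1f} GB for mxfp4 weights")
```

The bfloat16 layers, KV cache, and runtime buffers come on top of the mxfp4 weights, which is why 24 GB of unified memory leaves a comfortable margin.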

Files for gpt-oss reference torch implementation

pip install huggingface_hub  # to use `hf` command
hf download openai/gpt-oss-20b --include "original/*" --local-dir gpt-oss-20b/
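If you prefer to download from Python instead of the hf CLI, huggingface_hub's snapshot_download accepts the same kind of pattern filter. A sketch (the local_dir value simply mirrors the command above):

```python
from huggingface_hub import snapshot_download


def original_weight_kwargs(local_dir: str = "gpt-oss-20b") -> dict:
    """Arguments for fetching only the original (mxfp4/bf16) reference weights."""
    return dict(
        repo_id="openai/gpt-oss-20b",
        allow_patterns=["original/*"],  # same filter as `--include "original/*"`
        local_dir=local_dir,
    )


# snapshot_download(**original_weight_kwargs())  # uncomment to start the download
```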

Compile & Run

Important

To ensure equivalence with gpt-oss, please confirm that TVM was built with the patches applied.
You can install the patched TVM & MLC LLM by referring to the Wiki Setup page.

Basic single-turn test

python run_gpt_oss.py

Multi-turn chat

python chat.py

Use other target devices

The target device can be changed by modifying the following line in the scripts:

- engine = Engine(model_path, target="metal")
+ engine = Engine(model_path, target="<YOUR DEVICE TYPE>")

Supported device types are determined by TVM target support.
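Which target string to pass depends on your hardware. As a hypothetical illustration (the per-OS mapping below is an assumption for common setups, not a list maintained by this project), one could pick a default per host platform:

```python
import platform


def default_tvm_target() -> str:
    """Guess a reasonable TVM target string for the host (illustrative only)."""
    system = platform.system()
    if system == "Darwin":
        return "metal"  # Apple GPU on macOS, as used in the project's scripts
    if system == "Linux":
        return "cuda"   # assumes an NVIDIA GPU; AMD users might use "rocm"
    return "llvm"       # CPU fallback that TVM supports everywhere


# engine = Engine(model_path, target=default_tvm_target())  # hypothetical usage
```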

License

This project is licensed under the Apache License 2.0, in line with the licenses of gpt-oss and TVM.

Authors

  • @Liberatedwinner
  • @grf53
  • @jhlee525
  • @khj809
