A Gradio chat interface for gpt-oss models that uses either the OpenAI or Groq API.
- Clone the repository:
  git clone https://github.com/zakcali/gpt-oss-gradio-chat.git
  cd gpt-oss-gradio-chat
- Install the required packages:
  pip install -r requirements.txt
Before running the "gpt-oss-gradio-Groq.py" script, you need to set your API key as an environment variable:
export GROQ_API_KEY="your_api_key_here"
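If you want the script to fail fast with a clear message when the key is missing, a minimal sketch is shown below (the helper name get_groq_api_key is illustrative, not taken from the repository's code):

```python
import os

def get_groq_api_key() -> str:
    """Read the Groq API key from the environment, raising a clear error if unset."""
    key = os.environ.get("GROQ_API_KEY")
    if not key:
        raise RuntimeError(
            'GROQ_API_KEY is not set. Run: export GROQ_API_KEY="your_api_key_here"'
        )
    return key
```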
You can launch the application in several ways depending on your needs.
For local access only, use the default launch call at the end of the script:
demo.launch()
Modify the last line to include your machine's local IP address.
# Replace "192.168.0.xx" with your actual LAN IP address
demo.launch(server_name="192.168.0.xx", server_port=7860)
Set the share parameter to True. Gradio will generate a temporary public URL for you.
demo.launch(share=True)
For more details, see the official Gradio guide on Sharing Your App.
- gpt-oss-gradio-Groq.py: This script uses the Groq API and should run on any machine with an internet connection.
- gpt-oss-gradio-openai.py: This script is designed to run a local model and requires substantial hardware. For example, it has been tested on a Linux machine with 4x RTX 3090 GPUs (96 GB total VRAM) using the following vllm command to serve the model:
  vllm serve openai/gpt-oss-120b --tensor-parallel-size 4 --async-scheduling
Reference: Hugging Face Discussion
This script interfaces with a locally served model using vllm, a high-throughput LLM serving library. Ensure you have vllm installed and a compatible hardware setup before running the server command. No API key is required for this script.
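The vllm server above exposes an OpenAI-compatible chat-completions endpoint (by default at http://localhost:8000/v1). As a sketch of the request body such a client sends, assuming the defaults noted below (the build_chat_request helper is illustrative, not taken from the script):

```python
def build_chat_request(messages, model="openai/gpt-oss-120b",
                       temperature=1.0, top_p=1.0, stream=True):
    """Build the JSON body for a POST to /v1/chat/completions on the vllm server."""
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "stream": stream,
    }

payload = build_chat_request([{"role": "user", "content": "Hello!"}])
```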
- Default top_p is 1.0
- Displays reasoning_content in a separate box
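One way the separate reasoning box could be fed, assuming the streamed deltas carry an optional reasoning_content field alongside the regular content field (the split_stream helper below is an illustrative sketch, not the script's actual implementation):

```python
def split_stream(deltas):
    """Accumulate streamed deltas into (reasoning, answer) strings for two display boxes."""
    reasoning, answer = [], []
    for delta in deltas:
        if delta.get("reasoning_content"):
            reasoning.append(delta["reasoning_content"])
        if delta.get("content"):
            answer.append(delta["content"])
    return "".join(reasoning), "".join(answer)
```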