From a3aefd70e7b267d7c5c66bc0a8c135ac88eebe9d Mon Sep 17 00:00:00 2001
From: Aaron Pham
Date: Mon, 25 Aug 2025 14:27:52 -0400
Subject: [PATCH 1/4] Update GPT-OSS documentation with function calling

---
 OpenAI/GPT-OSS.md | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/OpenAI/GPT-OSS.md b/OpenAI/GPT-OSS.md
index 49022fd..a504543 100644
--- a/OpenAI/GPT-OSS.md
+++ b/OpenAI/GPT-OSS.md
@@ -169,7 +169,15 @@ mcp run -t sse python_server.py:mcp
 vllm serve ... --tool-server ip-1:port-1,ip-2:port-2
 ```
 
-The URLs are expected to be MCP SSE servers that implement `instructions` in server info and well documented tools. The tools will be injected into the system prompt for the model to enable them.
+The URLs are expected to be MCP SSE servers that implement `instructions` in server info and well-documented tools. The tools will be injected into the system prompt so that the model can use them.
+
+### Function calling
+
+vLLM supports function calling for the Chat Completion API. Make sure to run your gpt-oss models with the following:
+
+```bash
+vllm serve openai/gpt-oss-20b --tool-call-parser openai --reasoning-parser openai_gptoss --enable-auto-tool-choice
+```
 
 ## Accuracy Evaluation Panels
 
@@ -265,8 +273,8 @@ Meaning:
 | Response API | ✅ | ✅ | ✅ | ✅ | ✅ |
 | Response API with Background Mode | ✅ | ✅ | ✅ | ❌ | ✅ |
 | Response API with Streaming | ✅ | ✅ | ❌ | ❌ | ❌ |
-| Chat Completion API | ✅ | ✅ | ❌ | ❌ | ❌ |
-| Chat Completion API with Streaming | ✅ | ✅ | ❌ | ❌ | ❌ |
+| Chat Completion API | ✅ | ✅ | ❌ | ❌ | ✅ |
+| Chat Completion API with Streaming | ✅ | ✅ | ❌ | ❌ | ✅ |
 
 If you want to use offline inference, you can treat vLLM as a token-in-token-out service and pass in tokens that are already formatted with Harmony.

From e165b6baa0bf049302289ead270e04217071d9da Mon Sep 17 00:00:00 2001
From: Aaron Pham
Date: Fri, 5 Sep 2025 00:47:56 -0400
Subject: [PATCH 2/4] chore: update Chen's rewording

Co-authored-by: Chen Zhang
---
 OpenAI/GPT-OSS.md | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/OpenAI/GPT-OSS.md b/OpenAI/GPT-OSS.md
index 85a5d93..f834442 100644
--- a/OpenAI/GPT-OSS.md
+++ b/OpenAI/GPT-OSS.md
@@ -173,11 +173,10 @@ The URLs are expected to be MCP SSE servers that implement `instructions` in ser
 
 ### Function calling
 
-vLLM supports function calling for the Chat Completion API. Make sure to run your gpt-oss models with the following:
+vLLM also supports calling user-defined functions. Make sure to run your gpt-oss models with the following arguments.
 
 ```bash
-vllm serve openai/gpt-oss-20b --tool-call-parser openai --reasoning-parser openai_gptoss --enable-auto-tool-choice
-```
+vllm serve ... --tool-call-parser openai --reasoning-parser openai_gptoss --enable-auto-tool-choice
 
 ## Accuracy Evaluation Panels
 

From d664785f155585add2f0cde73dc1816e55eda5ad Mon Sep 17 00:00:00 2001
From: Aaron Pham
Date: Fri, 5 Sep 2025 00:50:06 -0400
Subject: [PATCH 3/4] Update OpenAI/GPT-OSS.md

---
 OpenAI/GPT-OSS.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/OpenAI/GPT-OSS.md b/OpenAI/GPT-OSS.md
index f834442..a9c972b 100644
--- a/OpenAI/GPT-OSS.md
+++ b/OpenAI/GPT-OSS.md
@@ -176,7 +176,7 @@ The URLs are expected to be MCP SSE servers that implement `instructions` in ser
 vLLM also supports calling user-defined functions. Make sure to run your gpt-oss models with the following arguments.
 
 ```bash
-vllm serve ... --tool-call-parser openai --reasoning-parser openai_gptoss --enable-auto-tool-choice
+vllm serve ... --tool-call-parser openai --enable-auto-tool-choice
 
 ## Accuracy Evaluation Panels
 
From 14ed30849693560908b5a1922420fe351dacd547 Mon Sep 17 00:00:00 2001
From: Aaron Pham
Date: Fri, 5 Sep 2025 00:51:58 -0400
Subject: [PATCH 4/4] Revise gpt-oss documentation for clarity and updates

Updated installation and usage instructions for gpt-oss, including commands
for setting up tool servers and running evaluations.
---
 OpenAI/GPT-OSS.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/OpenAI/GPT-OSS.md b/OpenAI/GPT-OSS.md
index a9c972b..4064c43 100644
--- a/OpenAI/GPT-OSS.md
+++ b/OpenAI/GPT-OSS.md
@@ -153,7 +153,7 @@ One premier feature of gpt-oss is the ability to call tools directly, called "bu
 * By default, we integrate with the reference library's browser (with `ExaBackend`) and demo Python interpreter via docker container. In order to use the search backend, you need to get access to [exa.ai](http://exa.ai) and set `EXA_API_KEY=` as an environment variable. For Python, either have docker available, or set `PYTHON_EXECUTION_BACKEND=dangerously_use_uv` to dangerously allow model-generated code snippets to be executed on the same machine. Please note that `PYTHON_EXECUTION_BACKEND=dangerously_use_uv` needs `gpt-oss>=0.0.5`.
 
-```
+```bash
 uv pip install gpt-oss
 
 vllm serve ... --tool-server demo
 ```
@@ -162,7 +162,7 @@ vllm serve ... --tool-server demo
 
 * Please note that the default options are simply for demo purposes. For production usage, vLLM itself can act as an MCP client to multiple services. Here is an [example tool server](https://github.com/openai/gpt-oss/tree/main/gpt-oss-mcp-server) that vLLM can work with; they wrap the demo tools:
 
-```
+```bash
 mcp run -t sse browser_server.py:mcp
 
 mcp run -t sse python_server.py:mcp
@@ -177,6 +177,7 @@ vLLM also supports calling user-defined functions. Make sure to run your gpt-oss
 
 ```bash
 vllm serve ... --tool-call-parser openai --enable-auto-tool-choice
+```
 
 ## Accuracy Evaluation Panels
 
@@ -184,7 +185,7 @@ OpenAI recommends using the gpt-oss reference library to perform evaluation.
 
 First, deploy the model with vLLM:
 
-```
+```bash
 # Example deployment on 8xH100
 vllm serve openai/gpt-oss-120b \
 --tensor_parallel_size 8 \
@@ -197,7 +198,7 @@ vllm serve openai/gpt-oss-120b \
 
 Then, run the evaluation with gpt-oss. The following command will run all three reasoning effort levels.
 
-```
+```bash
 mkdir -p /tmp/gpqa_openai
 OPENAI_API_KEY=empty python -m gpt_oss.evals --model openai/gpt-oss-120b --eval gpqa --n-threads 128
 ```
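Once a server is running with the function-calling flags added in these patches, any OpenAI-compatible client can exercise it through vLLM's standard `/v1/chat/completions` endpoint. The request below is a minimal sketch of that flow: it assumes the server is listening on vLLM's default `localhost:8000`, and the `get_weather` tool (with its `city` parameter) is a hypothetical example, not anything defined in the documentation above.

```bash
# Minimal function-calling request against a vLLM server started with, e.g.:
#   vllm serve openai/gpt-oss-20b --tool-call-parser openai --enable-auto-tool-choice
# Assumes the default address (localhost:8000); "get_weather" is a hypothetical tool.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-oss-20b",
    "messages": [{"role": "user", "content": "What is the weather in Boston?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }],
    "tool_choice": "auto"
  }'
```

If the model elects to call the function, the response's `choices[0].message` carries a `tool_calls` array (the name and JSON arguments that `--tool-call-parser openai` extracted from the model's Harmony output) instead of plain text; the client then runs the function, appends the result as a `"role": "tool"` message, and calls the endpoint again to get the final answer.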