
Commit ac04873

Run gpt-oss locally in LM Studio (#2006)
Co-authored-by: Dominik Kundel <[email protected]>
1 parent e6b0116 commit ac04873

File tree

4 files changed: +213 −1 lines changed

Lines changed: 199 additions & 0 deletions
@@ -0,0 +1,199 @@
# How to run gpt-oss locally with LM Studio

[LM Studio](https://lmstudio.ai) is a performant and friendly desktop application for running large language models (LLMs) on local hardware. This guide will walk you through how to set up and run **gpt-oss-20b** or **gpt-oss-120b** models using LM Studio, including how to chat with them, use MCP servers, or interact with the models through LM Studio's local development API.

Note that this guide is meant for consumer hardware, like running gpt-oss on a PC or Mac. For server applications with dedicated GPUs like NVIDIA's H100s, [check out our vLLM guide](https://cookbook.openai.com/articles/gpt-oss/run-vllm).

## Pick your model

LM Studio supports both model sizes of gpt-oss:

- [**`openai/gpt-oss-20b`**](https://lmstudio.ai/models/openai/gpt-oss-20b)
  - The smaller model
  - Requires at least **16GB of VRAM**
  - Perfect for higher-end consumer GPUs or Apple Silicon Macs
- [**`openai/gpt-oss-120b`**](https://lmstudio.ai/models/openai/gpt-oss-120b)
  - Our larger full-sized model
  - Best with **≥60GB of VRAM**
  - Ideal for multi-GPU or beefy workstation setups

LM Studio ships both a [llama.cpp](https://github.com/ggml-org/llama.cpp) inference engine (for running GGUF-formatted models) and an [Apple MLX](https://github.com/ml-explore/mlx) engine for Apple Silicon Macs.

## Quick setup

1. **Install LM Studio**
   LM Studio is available for Windows, macOS, and Linux. [Get it here](https://lmstudio.ai/download).

2. **Download the gpt-oss model**

   ```shell
   # For 20B
   lms get openai/gpt-oss-20b
   # or for 120B
   lms get openai/gpt-oss-120b
   ```

3. **Load the model in LM Studio**
   → Open LM Studio and use the model loading interface to load the gpt-oss model you downloaded. Alternatively, you can use the command line:

   ```shell
   # For 20B
   lms load openai/gpt-oss-20b
   # or for 120B
   lms load openai/gpt-oss-120b
   ```

4. **Use the model** → Once loaded, you can interact with the model directly in LM Studio's chat interface or through the API.

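As an optional sanity check, you can also use the `lms` CLI to list what you have on disk and what is loaded into memory:

```shell
# List models downloaded to your machine
lms ls
# List models currently loaded into memory
lms ps
```
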
## Chat with gpt-oss

Use LM Studio's chat interface to start a conversation with gpt-oss, or use the `chat` command in the terminal:

```shell
lms chat openai/gpt-oss-20b
```

Note about prompt formatting: LM Studio utilizes OpenAI's [Harmony](https://cookbook.openai.com/articles/openai-harmony) library to construct the input to gpt-oss models, both when running via llama.cpp and MLX.

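For a rough sense of what Harmony produces under the hood, a rendered conversation turn looks approximately like the sketch below (illustrative only; see the Harmony guide linked above for the exact token layout and channel semantics):

```text
<|start|>user<|message|>What is 2 + 2?<|end|>
<|start|>assistant<|channel|>analysis<|message|>The user asks a simple arithmetic question.<|end|>
<|start|>assistant<|channel|>final<|message|>2 + 2 = 4.<|end|>
```

You don't need to construct this format yourself; LM Studio handles it for you.
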
## Use gpt-oss with a local /v1/chat/completions endpoint

LM Studio exposes a **Chat Completions-compatible API**, so you can use the OpenAI SDK without changing much.

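The API is served by LM Studio's local server, which listens on port 1234 by default. If it isn't already running, you can start it from the app's Developer tab or from the command line:

```shell
lms server start
```

With the server running, here's a Python example:
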
```py
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",
    api_key="not-needed"  # LM Studio does not require an API key
)

result = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what MXFP4 quantization is."}
    ]
)

print(result.choices[0].message.content)
```

If you've used the OpenAI SDK before, this will feel instantly familiar, and your existing code should work simply by changing the base URL.

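Streaming also works through the same endpoint. Below is a minimal sketch (assuming the same local server and model as above) that prints tokens as they arrive:

```py
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

# Request a streamed response instead of a single final message.
stream = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    messages=[{"role": "user", "content": "Write a haiku about local inference."}],
    stream=True,
)

# Each chunk carries an incremental delta; print tokens as they arrive.
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
print()
```
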
## How to use MCPs in the chat UI
84+
85+
LM Studio is an [MCP client](https://lmstudio.ai/docs/app/plugins/mcp), which means you can connect MCP servers to it. This allows you to provide external tools to gpt-oss models.
86+
87+
LM Studio's mcp.json file is located in:
88+
89+
```shell
90+
~/.lmstudio/mcp.json
91+
```
92+
93+
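As an illustration, here is a sketch of what an entry in that file can look like, assuming the common `mcpServers` notation; the server name and the directory path below are placeholders for whatever MCP server you want to connect:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/dir"]
    }
  }
}
```
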
## Local tool use with gpt-oss in Python or TypeScript

LM Studio's SDK is available in both [Python](https://github.com/lmstudio-ai/lmstudio-python) and [TypeScript](https://github.com/lmstudio-ai/lmstudio-js). You can leverage the SDK to implement tool calling and local function execution with gpt-oss.

The way to achieve this is via the `.act()` call, which allows you to provide tools to gpt-oss and have it alternate between calling tools and reasoning until it completes your task.

The example below shows how to provide the model with a single tool that can create files on your local filesystem. You can use this example as a starting point and extend it with more tools. See the docs about tool definitions for [Python](https://lmstudio.ai/docs/python/agent/tools) and [TypeScript](https://lmstudio.ai/docs/typescript/agent/tools).

```shell
uv pip install lmstudio
```

```python
import readline  # Enables input line editing
from pathlib import Path

import lmstudio as lms

# Define a function that can be called by the model and provide it as a tool.
# Tools are just regular Python functions. They can be anything at all.
def create_file(name: str, content: str):
    """Create a file with the given name and content."""
    dest_path = Path(name)
    if dest_path.exists():
        return "Error: File already exists."
    try:
        dest_path.write_text(content, encoding="utf-8")
    except Exception as exc:
        return f"Error: {exc!r}"
    return "File created."

def print_fragment(fragment, round_index=0):
    # .act() supplies the round index as the second parameter.
    # Setting a default value means the callback is also
    # compatible with .complete() and .respond().
    print(fragment.content, end="", flush=True)

model = lms.llm("openai/gpt-oss-20b")
chat = lms.Chat("You are a helpful assistant running on the user's computer.")

while True:
    try:
        user_input = input("User (leave blank to exit): ")
    except EOFError:
        print()
        break
    if not user_input:
        break
    chat.add_user_message(user_input)
    print("Assistant: ", end="", flush=True)
    model.act(
        chat,
        [create_file],
        on_message=chat.append,
        on_prediction_fragment=print_fragment,
    )
    print()
```

For TypeScript developers who want to utilize gpt-oss locally, here's a similar example using `lmstudio-js`:

```shell
npm install @lmstudio/sdk
```

```typescript
import { Chat, LMStudioClient, tool } from "@lmstudio/sdk";
import { existsSync } from "fs";
import { writeFile } from "fs/promises";
import { createInterface } from "readline/promises";
import { z } from "zod";

const rl = createInterface({ input: process.stdin, output: process.stdout });
const client = new LMStudioClient();
const model = await client.llm.model("openai/gpt-oss-20b");
const chat = Chat.empty();

const createFileTool = tool({
  name: "createFile",
  description: "Create a file with the given name and content.",
  parameters: { name: z.string(), content: z.string() },
  implementation: async ({ name, content }) => {
    if (existsSync(name)) {
      return "Error: File already exists.";
    }
    await writeFile(name, content, "utf-8");
    return "File created.";
  },
});

while (true) {
  const input = await rl.question("User: ");
  // Append the user input to the chat
  chat.append("user", input);

  process.stdout.write("Assistant: ");
  await model.act(chat, [createFileTool], {
    // When the model finishes the entire message, push it to the chat
    onMessage: (message) => chat.append(message),
    onPredictionFragment: ({ content }) => {
      process.stdout.write(content);
    },
  });
  process.stdout.write("\n");
}
```

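Since this example uses top-level `await`, it needs to run as an ES module. One convenient way (assuming you saved the file as `agent.ts`, a hypothetical name) is the `tsx` runner:

```shell
npx tsx agent.ts
```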

articles/gpt-oss/run-vllm.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 [vLLM](https://docs.vllm.ai/en/latest/) is an open-source, high-throughput inference engine designed to efficiently serve large language models (LLMs) by optimizing memory usage and processing speed. This guide will walk you through how to use vLLM to set up **gpt-oss-20b** or **gpt-oss-120b** on a server to serve gpt-oss as an API for your applications, and even connect it to the Agents SDK.

-Note that this guide is meant for server applications with dedicated GPUs like NVIDIA’s H100s. For local inference on consumer GPUs, [check out our Ollama guide](https://cookbook.openai.com/articles/gpt-oss/run-locally-ollama).
+Note that this guide is meant for server applications with dedicated GPUs like NVIDIA’s H100s. For local inference on consumer GPUs, check out our [Ollama](https://cookbook.openai.com/articles/gpt-oss/run-locally-ollama) or [LM Studio](https://cookbook.openai.com/articles/gpt-oss/run-locally-lmstudio) guides.

 ## Pick your model
authors.yaml

Lines changed: 5 additions & 0 deletions
@@ -451,3 +451,8 @@ Vaibhavs10:
   name: "vb"
   website: "https://huggingface.co/reach-vb"
   avatar: "https://cdn-avatars.huggingface.co/v1/production/uploads/1655385361868-61b85ce86eb1f2c5e6233736.jpeg"
+
+yagil:
+  name: "Yagil Burowski"
+  website: "https://x.com/yagilb"
+  avatar: "https://avatars.lmstudio.com/profile-images/yagil"

registry.yaml

Lines changed: 8 additions & 0 deletions
@@ -4,6 +4,14 @@
 # should build pages for, and indicates metadata such as tags, creation date and
 # authors for each page.

+- title: How to run gpt-oss locally with LM Studio
+  path: articles/run-locally-lmstudio.md
+  date: 2025-08-07
+  authors:
+    - yagil
+  tags:
+    - gpt-oss
+    - open-models
 - title: GPT-5 Prompt Migration and Improvement Using the New Optimizer
   path: examples/gpt-5/prompt-optimization-cookbook.ipynb
   date: 2025-08-07