Commit d8b8efe

Refreshes reference-rest.md
1 parent 2de1d5c commit d8b8efe

File tree

1 file changed: +42 −42 lines changed

articles/ai-foundry/foundry-local/reference/reference-rest.md

Lines changed: 42 additions & 42 deletions
@@ -10,7 +10,7 @@ ms.reviewer: samkemp
 author: jonburchel
 reviewer: samuel100
 ms.topic: concept-article
-ms.date: 07/03/2025
+ms.date: 10/01/2025
 ---
 
 # Foundry Local REST API Reference
@@ -25,7 +25,7 @@ ms.date: 07/03/2025
 ### POST /v1/chat/completions
 
 This endpoint processes chat completion requests.
-Fully compatible with the [OpenAI Chat Completions API](https://platform.openai.com/docs/api-reference/chat/create)
+It's fully compatible with the [OpenAI Chat Completions API](https://platform.openai.com/docs/api-reference/chat/create).
 
 **Request Body:**
 
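As a quick illustration of the endpoint above, here is a minimal Python client sketch. The base URL and port are placeholders (check your running Foundry Local service for the actual endpoint), and the model name is an example taken from elsewhere in this reference:

```python
import json
import urllib.request

# Placeholder base URL; substitute the address your Foundry Local service reports.
BASE_URL = "http://localhost:5273"

def build_chat_request(model, messages, max_completion_tokens=256):
    """Build a request body for POST /v1/chat/completions."""
    return {
        "model": model,
        "messages": messages,
        "max_completion_tokens": max_completion_tokens,
    }

body = build_chat_request(
    "Phi-4-mini-instruct-generic-cpu",
    [{"role": "user", "content": "Hello!"}],
)
req = urllib.request.Request(
    f"{BASE_URL}/v1/chat/completions",
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# response = urllib.request.urlopen(req)  # uncomment against a running server
```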
@@ -52,8 +52,8 @@ _---Standard OpenAI Properties---_
   Up to 4 sequences that will cause the model to stop generating further tokens.
 - `max_tokens` (integer, optional)
   Maximum number of tokens to generate. For newer models, use `max_completion_tokens` instead.
-- `max_completion_token` (integer, optional)
-  Maximum token limit for generation, including both visible output and reasoning tokens.
+- `max_completion_tokens` (integer, optional)
+  Maximum number of tokens the model can generate, including visible output and reasoning tokens.
 - `presence_penalty` (number, optional)
   Value between -2.0 and 2.0. Positive values encourage the model to discuss new topics by penalizing tokens that have already appeared.
 - `frequency_penalty` (number, optional)
@@ -73,7 +73,7 @@ _---Standard OpenAI Properties---_
   Function parameters described as a JSON Schema object.
 - `function_call` (string or object, optional)
   Controls how the model responds to function calls.
-  - If object, may include:
+  - If object, can include:
     - `name` (string, optional)
       The name of the function to call.
     - `arguments` (object, optional)
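A sketch of a request body exercising the `functions` and `function_call` fields described above. The function name and JSON Schema here are made up for illustration; only the field names come from the reference:

```python
def build_function_call_request(model, messages):
    """Build a chat completion body that pins a specific function call."""
    return {
        "model": model,
        "messages": messages,
        "functions": [
            {
                "name": "get_weather",  # hypothetical function for this example
                "description": "Look up the weather for a city.",
                "parameters": {  # parameters are a JSON Schema object
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            }
        ],
        # As an object, `function_call` can name the function to call:
        "function_call": {"name": "get_weather"},
    }
```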
@@ -178,25 +178,25 @@ _---Standard OpenAI Properties---_
 
 ### GET /foundry/list
 
-Retrieves a list of all available Foundry Local models in the catalog.
+Get a list of available Foundry Local models in the catalog.
 
 **Response:**
 
 - `models` (array)
-  List of model objects, each containing:
+  Array of model objects. Each model includes:
   - `name`: The unique identifier for the model.
   - `displayName`: A human-readable name for the model, often the same as the name.
-  - `providerType`: The type of provider hosting the model (e.g., AzureFoundry).
+  - `providerType`: The type of provider hosting the model (for example, AzureFoundry).
   - `uri`: The resource URI pointing to the model's location in the registry.
   - `version`: The version number of the model.
-  - `modelType`: The format or type of the model (e.g., ONNX).
+  - `modelType`: The format or type of the model (for example, ONNX).
   - `promptTemplate`:
     - `assistant`: The template for the assistant's response.
     - `prompt`: The template for the user-assistant interaction.
   - `publisher`: The entity or organization that published the model.
-  - `task`: The primary task the model is designed to perform (e.g., chat-completion).
+  - `task`: The primary task the model is designed to perform (for example, chat completion).
   - `runtime`:
-    - `deviceType`: The type of hardware the model is designed to run on (e.g., CPU).
+    - `deviceType`: The type of hardware the model is designed to run on (for example, CPU).
     - `executionProvider`: The execution provider used for running the model.
   - `fileSizeMb`: The size of the model file in megabytes.
   - `modelSettings`:
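A small sketch of consuming the `/foundry/list` response on the client side, filtering the catalog by `runtime.deviceType`. The sample response below is illustrative, shaped after the fields documented above:

```python
def models_for_device(catalog, device_type):
    """Return names of catalog models targeting the given deviceType."""
    return [
        m["name"]
        for m in catalog.get("models", [])
        if m.get("runtime", {}).get("deviceType") == device_type
    ]

# Illustrative response fragment, not real catalog data.
sample = {
    "models": [
        {"name": "model-a", "runtime": {"deviceType": "CPU"}},
        {"name": "model-b", "runtime": {"deviceType": "GPU"}},
    ]
}
```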
@@ -238,7 +238,7 @@ Registers an external model provider for use with Foundry Local.
 
 ### GET /openai/models
 
-Retrieves all available models, including both local models and registered external models.
+Get all available models, including local and registered external models.
 
 **Response:**
 
@@ -254,7 +254,7 @@ Retrieves all available models, including both local models and registered exter
 
 ### GET /openai/load/{name}
 
-Loads a model into memory for faster inference.
+Load a model into memory for faster inference.
 
 **URI Parameters:**
 
@@ -266,7 +266,7 @@ Loads a model into memory for faster inference.
 - `unload` (boolean, optional)
   Whether to automatically unload the model after idle time. Defaults to `true`.
 - `ttl` (integer, optional)
-  Time to live in seconds. If greater than 0, overrides `unload` parameter.
+  Time to live in seconds. If it's greater than 0, this value overrides the `unload` parameter.
 - `ep` (string, optional)
   Execution provider to run this model. Supports: `"dml"`, `"cuda"`, `"qnn"`, `"cpu"`, `"webgpu"`.
   If not specified, uses settings from `genai_config.json`.
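The query-parameter handling above can be sketched as a small URL builder. The base URL is a placeholder (use the address your Foundry Local service reports); the parameter names `ttl` and `ep` come from the reference:

```python
from urllib.parse import quote, urlencode

# Placeholder base URL; substitute your service's actual address.
BASE_URL = "http://localhost:5273"

def build_load_url(name, ttl=None, ep=None):
    """Build the GET /openai/load/{name} URL with optional ttl/ep parameters."""
    params = {}
    if ttl is not None:
        params["ttl"] = ttl  # seconds; overrides `unload` when > 0
    if ep is not None:
        params["ep"] = ep    # one of: dml, cuda, qnn, cpu, webgpu
    url = f"{BASE_URL}/openai/load/{quote(name)}"
    return f"{url}?{urlencode(params)}" if params else url
```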
@@ -279,13 +279,13 @@ Loads a model into memory for faster inference.
 **Example:**
 
 - Request URI
-  ```
-  GET /openai/load/Phi-4-mini-instruct-generic-cpu?ttl=3600&ep=dml
-  ```
+  ```http
+  GET /openai/load/Phi-4-mini-instruct-generic-cpu?ttl=3600&ep=dml
+  ```
 
 ### GET /openai/unload/{name}
 
-Unloads a model from memory.
+Unload a model from memory.
 
 **URI Parameters:**
 
@@ -305,9 +305,9 @@ Unloads a model from memory.
 **Example:**
 
 - Request URI
-  ```
-  GET /openai/unload/Phi-4-mini-instruct-generic-cpu?force=true
-  ```
+  ```http
+  GET /openai/unload/Phi-4-mini-instruct-generic-cpu?force=true
+  ```
 
 ### GET /openai/unloadall
 
@@ -320,7 +320,7 @@ Unloads all models from memory.
 
 ### GET /openai/loadedmodels
 
-Retrieves a list of currently loaded models.
+Get the list of currently loaded models.
 
 **Response:**
 
@@ -336,7 +336,7 @@ Retrieves a list of currently loaded models.
 
 ### GET /openai/getgpudevice
 
-Retrieves the currently selected GPU device ID.
+Get the current GPU device ID.
 
 **Response:**
 
@@ -345,7 +345,7 @@ Retrieves the currently selected GPU device ID.
 
 ### GET /openai/setgpudevice/{deviceId}
 
-Sets the active GPU device.
+Set the active GPU device.
 
 **URI Parameters:**
 
@@ -360,16 +360,16 @@ Sets the active GPU device.
 **Example:**
 
 - Request URI
-  ```
-  GET /openai/setgpudevice/1
-  ```
+  ```http
+  GET /openai/setgpudevice/1
+  ```
 
 ### POST /openai/download
 
-Downloads a model to local storage.
+Download a model to local storage.
 
 > [!NOTE]
-> Model downloads can take significant time, especially for large models. We recommend setting a high timeout for this request to avoid premature termination.
+> Large model downloads can take a long time. Set a high timeout for this request to avoid early termination.
 
 **Request Body:**
 
@@ -379,11 +379,11 @@ Downloads a model to local storage.
 - `Name` (string)
   The model name.
 - `ProviderType` (string, optional)
-  The provider type (e.g., `"AzureFoundryLocal"`,`"HuggingFace"`).
+  The provider type (for example, `"AzureFoundryLocal"`, `"HuggingFace"`).
 - `Path` (string, optional)
-  The remote path where the model is located stored. For example, in a Hugging Face repository, this would be the path to the model files.
+  Remote path to the model files. For example, in a Hugging Face repository, this is the path to the model files.
 - `PromptTemplate` (`Dictionary<string, string>`, optional)
-  Contains:
+  Includes:
   - `system` (string, optional)
     The template for the system message.
   - `user` (string, optional)
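A sketch of assembling the download request body from the fields documented above. Whether these fields sit at the top level of the JSON body is an assumption here; only the field names (`Name`, `ProviderType`, `Path`, `PromptTemplate`) come from the reference:

```python
def build_download_body(name, provider_type=None, path=None, prompt_template=None):
    """Assemble a POST /openai/download body, omitting unset optional fields.

    Top-level placement of these fields is assumed, not confirmed by the docs.
    """
    body = {"Name": name}
    if provider_type is not None:
        body["ProviderType"] = provider_type
    if path is not None:
        body["Path"] = path
    if prompt_template is not None:
        body["PromptTemplate"] = prompt_template  # e.g. {"system": ..., "user": ...}
    return body
```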
@@ -410,7 +410,7 @@ Downloads a model to local storage.
 
 During download, the server streams progress updates in the format:
 
-```
+```text
 ("file name", percentage_complete)
 ```
 
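On the client side, each streamed update in that format can be parsed as a Python-style tuple literal. This is a sketch assuming one update per line, as in the sample response stream shown in this reference:

```python
import ast

def parse_progress(line):
    """Parse one streamed update like '("model.onnx.data", 0.5)'."""
    name, pct = ast.literal_eval(line.strip())
    return name, float(pct)
```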
@@ -444,14 +444,14 @@ During download, the server streams progress updates in the format:
 
 - Response stream
 
-  ```
-  ("genai_config.json", 0.01)
-  ("genai_config.json", 0.2)
-  ("model.onnx.data", 0.5)
-  ("model.onnx.data", 0.78)
-  ...
-  ("", 1)
-  ```
+  ```text
+  ("genai_config.json", 0.01)
+  ("genai_config.json", 0.2)
+  ("model.onnx.data", 0.5)
+  ("model.onnx.data", 0.78)
+  ...
+  ("", 1)
+  ```
 
 - Final response
   ```json
@@ -463,7 +463,7 @@ During download, the server streams progress updates in the format:
 
 ### GET /openai/status
 
-Retrieves server status information.
+Get server status information.
 
 **Response body:**
 