Get a list of available Foundry Local models in the catalog.
**Response:**
- `models` (array)

  Array of model objects. Each model includes:

  - `name`: The unique identifier for the model.
  - `displayName`: A human-readable name for the model, often the same as the name.
  - `providerType`: The type of provider hosting the model (for example, AzureFoundry).
  - `uri`: The resource URI pointing to the model's location in the registry.
  - `version`: The version number of the model.
  - `modelType`: The format or type of the model (for example, ONNX).
  - `promptTemplate`:
    - `assistant`: The template for the assistant's response.
    - `prompt`: The template for the user-assistant interaction.
  - `publisher`: The entity or organization that published the model.
  - `task`: The primary task the model is designed to perform (for example, chat-completion).
  - `runtime`:
    - `deviceType`: The type of hardware the model is designed to run on (for example, CPU).
    - `executionProvider`: The execution provider used for running the model.
  - `fileSizeMb`: The size of the model file in megabytes.
  - `modelSettings`:
### GET /openai/models
Get all available models, including local and registered external models.
**Response:**
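As a minimal client sketch, the endpoint can be called with the standard library. The base URL is an assumption here: Foundry Local chooses its port at startup, so substitute the address your local service actually reports.

```python
import json
import urllib.request

# Assumption: replace BASE_URL with the endpoint your local
# Foundry Local service reports at startup.
BASE_URL = "http://localhost:5273"

def models_url(base_url: str = BASE_URL) -> str:
    """Request URI for GET /openai/models."""
    return f"{base_url}/openai/models"

def list_models(base_url: str = BASE_URL):
    """Fetch every available model, local and registered external."""
    with urllib.request.urlopen(models_url(base_url)) as resp:
        return json.load(resp)
```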
### GET /openai/load/{name}
Load a model into memory for faster inference.
**URI Parameters:**
- `unload` (boolean, optional)

  Whether to automatically unload the model after idle time. Defaults to `true`.

- `ttl` (integer, optional)

  Time to live in seconds. If it's greater than 0, this value overrides the `unload` parameter.

- `ep` (string, optional)

  Execution provider to run this model. Supports: `"dml"`, `"cuda"`, `"qnn"`, `"cpu"`, `"webgpu"`.
  If not specified, uses settings from `genai_config.json`.
**Example:**
Request URI

```http
GET /openai/load/Phi-4-mini-instruct-generic-cpu?ttl=3600&ep=dml
```
### GET /openai/unload/{name}
Unload a model from memory.
**URI Parameters:**
**Example:**
Request URI

```http
GET /openai/unload/Phi-4-mini-instruct-generic-cpu?force=true
```
### GET /openai/unloadall
Unload all models from memory.
### GET /openai/loadedmodels
Get the list of currently loaded models.
**Response:**
### GET /openai/getgpudevice
Get the current GPU device ID.
**Response:**
### GET /openai/setgpudevice/{deviceId}
Set the active GPU device.
**URI Parameters:**
**Example:**
Request URI

```http
GET /openai/setgpudevice/1
```
### POST /openai/download
Download a model to local storage.
> [!NOTE]
> Large model downloads can take a long time. Set a high timeout for this request to avoid early termination.
**Request Body:**
- `Name` (string)

  The model name.

- `ProviderType` (string, optional)

  The provider type (for example, `"AzureFoundryLocal"`, `"HuggingFace"`).

- `Path` (string, optional)

  Remote path to the model files. For example, in a Hugging Face repository, this is the path to the files within the repository.