articles/machine-learning/reference-model-inference-chat-completions.md (31 additions & 14 deletions)
@@ -29,6 +29,14 @@ POST /chat/completions?api-version=2024-04-01-preview
 | --- | --- | --- | --- | --- |
 | api-version | query | True | string | The version of the API in the format "YYYY-MM-DD" or "YYYY-MM-DD-preview". |
 
+## Request Header
+
+| Name | Required | Type | Description |
+| --- | --- | --- | --- |
+| extra-parameters || string | The behavior of the API when extra parameters are indicated in the payload. Using `pass-through` makes the API pass the parameter to the underlying model. Use this value when you want to pass parameters that you know the underlying model can support. Using `ignore` makes the API drop any unsupported parameter. Use this value when you need to use the same payload across different models, but one of the extra parameters might make a model error out if not supported. Using `error` makes the API reject any extra parameter in the payload. Only parameters specified in this API can be indicated, or a 400 error is returned. |
+| azureml-model-deployment || string | Name of the deployment you want to route the request to. Supported for endpoints that support multiple deployments. |
+
 ## Request Body
 
 | Name | Required | Type | Description |
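The headers added above can be attached with any HTTP client. As a minimal sketch using only the Python standard library, with the endpoint URL, key, and deployment name as hypothetical placeholders:

```python
import urllib.request

# Hypothetical endpoint and key -- replace with your own values.
url = ("https://myendpoint.inference.ai.azure.com/chat/completions"
       "?api-version=2024-04-01-preview")

headers = {
    "Authorization": "Bearer <your-key>",         # placeholder credential
    "Content-Type": "application/json",
    "extra-parameters": "pass-through",           # forward extra params to the model
    "azureml-model-deployment": "my-deployment",  # hypothetical deployment name
}

# Build (but don't send) the request so the headers can be inspected.
req = urllib.request.Request(url, method="POST", headers=headers)
print(req.headers["Extra-parameters"])
```

Note that `urllib` normalizes header names to capitalized form when storing them; the service compares them case-insensitively.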
@@ -112,7 +120,7 @@ POST /chat/completions?api-version=2024-04-01-preview
     "stream": false,
     "temperature": 0,
     "top_p": 1,
-    "response_format": "text"
+    "response_format": { "type": "text" }
 }
 ```
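The change above replaces a bare string with an object carrying a `type` field. A sketch of building such a body in Python (the prompt text is hypothetical; only the `response_format` shape comes from the corrected example):

```python
import json

# Request body mirroring the corrected example: response_format is an
# object with a "type" field, not a bare string.
body = {
    "messages": [
        {"role": "user", "content": "How many feet are in a mile?"},  # hypothetical prompt
    ],
    "stream": False,
    "temperature": 0,
    "top_p": 1,
    "response_format": {"type": "text"},  # previously the invalid bare string "text"
}

payload = json.dumps(body, indent=4)
print(payload)
```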
@@ -156,7 +164,8 @@ Status code: 200
 |[ChatCompletionFinishReason](#chatcompletionfinishreason)| The reason the model stopped generating tokens. This will be `stop` if the model hit a natural stop point or a provided stop sequence, `length` if the maximum number of tokens specified in the request was reached, `content_filter` if content was omitted due to a flag from our content filters, `tool_calls` if the model called a tool. |
 |[ChatCompletionResponseFormat](#chatcompletionresponseformat)| The response format for the model response. Setting to `json_object` enables JSON mode, which guarantees the message the model generates is valid JSON. When using JSON mode, you **must** also instruct the model to produce JSON yourself via a system or user message. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length. |
+|[ChatCompletionResponseFormatType](#chatcompletionresponseformattype)| The response format type. |
 |[ChatCompletionResponseMessage](#chatcompletionresponsemessage)| A chat completion message generated by the model. |
 |[ChatCompletionTool](#chatcompletiontool)||
 |[ChatMessageRole](#chatmessagerole)| The role of the author of this message. |
@@ -165,15 +174,15 @@ Status code: 200
 |[ContentFilterError](#contentfiltererror)| The API call fails when the prompt triggers a content filter as configured. Modify the prompt and try again. |
 |[CreateChatCompletionResponse](#createchatcompletionresponse)| Represents a chat completion response returned by the model, based on the provided input. |
-|[Detail](#detail)||
+|[Detail](#detail)| Details for the [UnprocessableContentError](#unprocessablecontenterror) error. |
 |[Function](#function)| The function that the model called. |
-|[FunctionObject](#functionobject)||
+|[FunctionObject](#functionobject)| Definition of a function the model has access to. |
 |[ImageDetail](#imagedetail)| Specifies the detail level of the image. |
-|[NotFoundError](#notfounderror)||
+|[NotFoundError](#notfounderror)| The route is not valid for the deployed model. |
 |[ToolType](#tooltype)| The type of the tool. Currently, only `function` is supported. |
 |[TooManyRequestsError](#toomanyrequestserror)| You have hit your assigned rate limit and your requests need to be paced. |
+|[UnauthorizedError](#unauthorizederror)| Authentication is missing or invalid. |
+|[UnprocessableContentError](#unprocessablecontenterror)| The request contains unprocessable content. The error is returned when the payload is valid according to this specification, but some of the instructions in the payload are not supported by the underlying model. Use the `details` section to understand the offending parameter. |
 
 
 ### ChatCompletionFinishReason
@@ -208,6 +217,15 @@ The object type, which is always `chat.completion`.
 
 ### ChatCompletionResponseFormat
 
+The response format for the model response. Setting to `json_object` enables JSON mode, which guarantees the message the model generates is valid JSON. When using JSON mode, you **must** also instruct the model to produce JSON yourself via a system or user message. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length.
+
+| Name | Type | Description |
+| --- | --- | --- |
+| type |[ChatCompletionResponseFormatType](#chatcompletionresponseformattype)| The response format type. |
+
+### ChatCompletionResponseFormatType
+
 The response format type.
 
 | Name | Type | Description |
 | --- | --- | --- |
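JSON mode carries two caveats worth pairing in code: the model must be told to emit JSON, and a `length`-truncated reply may be cut off mid-document. A sketch under those assumptions (the request and response fragments are hypothetical):

```python
import json

# Pair response_format with an explicit JSON instruction in a system
# message, then check finish_reason before parsing the reply.
request_body = {
    "messages": [
        {"role": "system", "content": "Reply with a JSON object."},  # required instruction
        {"role": "user", "content": "Name two primary colors."},
    ],
    "response_format": {"type": "json_object"},
}

def parse_reply(choice: dict) -> dict:
    # A truncated generation (finish_reason == "length") may be cut off
    # mid-JSON, so refuse to parse it.
    if choice["finish_reason"] == "length":
        raise ValueError("generation truncated; raise max_tokens or shorten the prompt")
    return json.loads(choice["message"]["content"])

# Hypothetical choice shaped like the response schema above.
choice = {"finish_reason": "stop", "message": {"content": '{"colors": ["red", "blue"]}'}}
print(parse_reply(choice))
```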
@@ -236,7 +254,6 @@ A chat completion message generated by the model.
 
 The role of the author of this message.
 
-
 | Name | Type | Description |
 | --- | --- | --- |
 | assistant | string ||
@@ -248,7 +265,6 @@ The role of the author of this message.
 
 A list of chat completion choices. Can be more than one if `n` is greater than 1.
 
-
 | Name | Type | Description |
 | --- | --- | --- |
 | finish\_reason |[ChatCompletionFinishReason](#chatcompletionfinishreason)| The reason the model stopped generating tokens. This will be `stop` if the model hit a natural stop point or a provided stop sequence, `length` if the maximum number of tokens specified in the request was reached, `content_filter` if content was omitted due to a flag from our content filters, `tool_calls` if the model called a tool. |
@@ -281,7 +297,6 @@ The API call fails when the prompt triggers a content filter as configured. Modify the prompt and try again.
 
 ### CreateChatCompletionRequest
 
-
 | Name | Type | Default Value | Description |
 | --- | --- | --- | --- |
 | frequency\_penalty | number | 0 | Helps prevent word repetitions by reducing the chance of a word being selected if it has already been used. The higher the frequency penalty, the less likely the model is to repeat the same words in its output. Returns a 422 error if the value or parameter isn't supported by the model. |
@@ -347,7 +362,6 @@ Specifies the detail level of the image.
 
 Represents a chat completion response returned by the model, based on the provided input.
 
-
 | Name | Type | Description |
 | --- | --- | --- |
 | choices |[Choices](#choices)\[\]| A list of chat completion choices. Can be more than one if `n` is greater than 1. |
@@ -360,6 +374,7 @@ Represents a chat completion response returned by the model, based on the provided input.
 
 ### Detail
 
+Details for the [UnprocessableContentError](#unprocessablecontenterror) error.
 
 | Name | Type | Description |
 | --- | --- | --- |
@@ -370,14 +385,14 @@ Represents a chat completion response returned by the model, based on the provided input.
 
 The function that the model called.
 
-
 | Name | Type | Description |
 | --- | --- | --- |
 | arguments | string | The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may generate incorrect parameters not defined by your function schema. Validate the arguments in your code before calling your function. |
 | name | string | The name of the function to call. |
 
 ### FunctionObject
 
+Definition of a function the model has access to.
 
 | Name | Type | Description |
 | --- | --- | --- |
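The `arguments` row above warns that the model's tool-call arguments may be invalid JSON or include parameters outside your schema. A defensive-validation sketch, with a hypothetical parameter set:

```python
import json

# Validate a tool call's argument string against the declared parameter
# names before invoking the real function.
ALLOWED_PARAMS = {"city", "unit"}  # hypothetical function schema

def validate_arguments(raw: str) -> dict:
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model produced invalid JSON: {exc}") from exc
    unknown = set(args) - ALLOWED_PARAMS
    if unknown:
        raise ValueError(f"unexpected parameters: {sorted(unknown)}")
    return args

args = validate_arguments('{"city": "Seattle", "unit": "celsius"}')
print(args)
```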
@@ -406,6 +421,7 @@ The type of the tool. Currently, only `function` is supported.
 
 ### TooManyRequestsError
 
+
 | Name | Type | Description |
 | --- | --- | --- |
 | error | string | The error description. |
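Since this error means requests need to be paced, a retry loop with exponential backoff is the usual response. A sketch, where `send` stands in for the real HTTP call:

```python
import random
import time

# Pace retries after a 429 TooManyRequestsError; `send` is a stand-in
# for the real HTTP call and returns a status code.
def call_with_backoff(send, max_retries: int = 5) -> int:
    status = 429
    for attempt in range(max_retries):
        status = send()
        if status != 429:
            break
        # Exponential backoff with jitter, scaled down here for demo purposes.
        time.sleep(min(2 ** attempt, 30) * random.random() * 0.01)
    return status

# Simulated endpoint that rate-limits the first two calls.
responses = iter([429, 429, 200])
print(call_with_backoff(lambda: next(responses)))
```

In production you would honor a `Retry-After` header if the service returns one, rather than relying on the jittered delay alone.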
@@ -423,11 +439,12 @@ The type of the tool. Currently, only `function` is supported.
 
 ### UnprocessableContentError
 
+The request contains unprocessable content. The error is returned when the payload is valid according to this specification, but some of the instructions in the payload are not supported by the underlying model. Use the `details` section to understand the offending parameter.
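Client code can surface the offending parameter from the `details` section before retrying with an adjusted payload. The exact `Detail` fields aren't shown in this excerpt, so the error body below is hypothetical and only illustrates the pattern:

```python
import json

# Hypothetical 422 error body; treat the field names inside "detail" as
# illustrative stand-ins for the article's Detail schema.
error_body = json.loads("""
{
    "error": "One of the parameters is not supported",
    "status": 422,
    "detail": {"loc": ["body", "response_format"], "value": "text"}
}
""")

def offending_parameter(body: dict) -> str:
    # Join the location path to name the parameter the model rejected.
    return ".".join(body["detail"]["loc"])

print(offending_parameter(error_body))
```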