openapi.yaml: 17 additions & 11 deletions
@@ -7925,17 +7925,17 @@ components:
 Determinism is not guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend.
 x-oaiMeta:
 beta: true
-service_tier:
-description: |
-Specifies the latency tier to use for processing the request. This parameter is relevant for customers subscribed to the scale tier service:
-- If set to 'auto', the system will utilize scale tier credits until they are exhausted.
-- If set to 'default', the request will be processed in the shared cluster.
-
-When this parameter is set, the response body will include the `service_tier` utilized.
-type: string
-enum: ["auto", "default"]
-nullable: true
-default: null
+service_tier:
+description: |
+Specifies the latency tier to use for processing the request. This parameter is relevant for customers subscribed to the scale tier service:
+- If set to 'auto', the system will utilize scale tier credits until they are exhausted.
+- If set to 'default', the request will be processed in the shared cluster.
+
+When this parameter is set, the response body will include the `service_tier` utilized.
+type: string
+enum: ["auto", "default"]
+nullable: true
+default: null
 stop:
 description: |
 Up to 4 sequences where the API will stop generating further tokens.
@@ -8259,6 +8259,12 @@ components:
 model:
 type: string
 description: The model to generate the completion.
+service_tier:
+description: The service tier used for processing the request. This field is only included if the `service_tier` parameter is specified in the request.
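
To illustrate how a client might honor the `service_tier` schema in this diff, here is a minimal Python sketch. It assumes only what the diff states: the request parameter accepts `"auto"`, `"default"`, or null (the default), and the response includes `service_tier` only when the request set it. The helper names `build_request_body` and `tier_used` are hypothetical and not part of any SDK.

```python
from typing import Any, Dict, Optional

# Allowed values copied from the `enum` in the request schema.
# `None` mirrors `nullable: true` / `default: null`.
ALLOWED_SERVICE_TIERS = ("auto", "default")


def build_request_body(model: str, service_tier: Optional[str] = None) -> Dict[str, Any]:
    """Hypothetical helper: build a request body, validating `service_tier`."""
    if service_tier is not None and service_tier not in ALLOWED_SERVICE_TIERS:
        raise ValueError(
            f"service_tier must be one of {ALLOWED_SERVICE_TIERS} or None, got {service_tier!r}"
        )
    body: Dict[str, Any] = {"model": model}
    # Omit the key entirely when unset, so the response is not expected to echo it.
    if service_tier is not None:
        body["service_tier"] = service_tier
    return body


def tier_used(response: Dict[str, Any]) -> Optional[str]:
    """Hypothetical helper: read the tier reported in a response body.

    Per the response schema, the field is present only if the request set
    `service_tier`, so a missing key is treated as "not reported".
    """
    return response.get("service_tier")
```

Using `.get` when reading the response keeps the client tolerant of requests that never set the parameter.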