Description
I'm trying to deploy a gpt-oss-120b provisioned throughput endpoint via the SDK (0.69.0), but I run into the following error:
databricks.sdk.errors.platform.InvalidParameterValue: Served entity system.ai.gpt-oss-120b is not eligible for Provisioned Throughput.
However, deploying the foundation model via the UI works just fine.
Reproduction
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import EndpointCoreConfigInput, ServedEntityInput
w = WorkspaceClient()
w.serving_endpoints.create_and_wait(
    name="pt_endpoint",
    config=EndpointCoreConfigInput(
        served_entities=[
            ServedEntityInput(
                name="foo",
                entity_name="system.ai.gpt-oss-120b",
                scale_to_zero_enabled=False,
                entity_version=1,
                min_provisioned_throughput=10,
                max_provisioned_throughput=100,
            )
        ]
    )
)
Expected behavior
The model serving endpoint is deployed.
Is it a regression?
Not that I know of. Tested some older versions, same result.
Debug Logs
INFO:databricks.sdk:loading DEFAULT profile from ~/.databrickscfg: host, token
DEBUG:databricks.sdk:Attempting to configure auth: pat
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): <workspace-url.com>:443
DEBUG:urllib3.connectionpool:<workspace-url.com>:443 "POST /api/2.0/serving-endpoints HTTP/11" 400 None
DEBUG:databricks.sdk:POST /api/2.0/serving-endpoints
> {
>   "config": {
>     "served_entities": [
>       {
>         "entity_name": "system.ai.gpt-oss-120b",
>         "entity_version": 1,
>         "max_provisioned_throughput": 100,
>         "min_provisioned_throughput": 10,
>         "name": "foo",
>         "scale_to_zero_enabled": false
>       }
>     ]
>   },
>   "name": "pt_endpoint"
> }
< 400 Bad Request
< {
<   "details": [
<     {
<       "@type": "type.googleapis.com/google.rpc.RequestInfo",
<       "request_id": "3b1a97ca-27cb-4db7-8975-8e2104617a10",
<       "serving_data": ""
<     }
<   ],
<   "error_code": "INVALID_PARAMETER_VALUE",
<   "message": "Served entity system.ai.gpt-oss-120b is not eligible for Provisioned Throughput."
< }
Traceback (most recent call last):
  File "/Users/user.name/projects/deploy_pt/main.py", line 9, in <module>
    w.serving_endpoints.create_and_wait(
  File "/Users/user.name/miniconda3/lib/python3.12/site-packages/databricks/sdk/service/serving.py", line 4157, in create_and_wait
    return self.create(
           ^^^^^^^^^^^^
  File "/Users/user.name/miniconda3/lib/python3.12/site-packages/databricks/sdk/service/serving.py", line 4136, in create
    op_response = self._api.do("POST", "/api/2.0/serving-endpoints", body=body, headers=headers)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user.name/miniconda3/lib/python3.12/site-packages/databricks/sdk/core.py", line 85, in do
    return self._api_client.do(
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/user.name/miniconda3/lib/python3.12/site-packages/databricks/sdk/_base_client.py", line 199, in do
    response = call(
               ^^^^^
  File "/Users/user.name/miniconda3/lib/python3.12/site-packages/databricks/sdk/retries.py", line 59, in wrapper
    raise err
  File "/Users/user.name/miniconda3/lib/python3.12/site-packages/databricks/sdk/retries.py", line 38, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user.name/miniconda3/lib/python3.12/site-packages/databricks/sdk/_base_client.py", line 301, in _perform
    raise error from None
databricks.sdk.errors.platform.InvalidParameterValue: Served entity system.ai.gpt-oss-120b is not eligible for Provisioned Throughput.
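For triage, the fields that matter most in the 400 response above are the error code, the message, and the request ID. Below is a minimal sketch, using only the standard library, that pulls them out of the response body. `summarize_error` is a hypothetical helper name and not part of the SDK, and it assumes the body has the same shape as the one in the debug log.

```python
import json


def summarize_error(body: str) -> dict:
    """Extract error code, message, and request ID from an error response body.

    Hypothetical helper (not part of databricks-sdk). Assumes the JSON shape
    seen in the debug log: top-level "error_code" and "message", plus a
    "details" list that may contain a google.rpc.RequestInfo entry.
    """
    payload = json.loads(body)
    request_id = None
    for detail in payload.get("details", []):
        # The request ID lives in the RequestInfo detail entry, if present.
        if detail.get("@type", "").endswith("google.rpc.RequestInfo"):
            request_id = detail.get("request_id")
            break
    return {
        "error_code": payload.get("error_code"),
        "message": payload.get("message"),
        "request_id": request_id,
    }


# Response body copied from the debug log above.
response_body = """
{
  "details": [
    {
      "@type": "type.googleapis.com/google.rpc.RequestInfo",
      "request_id": "3b1a97ca-27cb-4db7-8975-8e2104617a10",
      "serving_data": ""
    }
  ],
  "error_code": "INVALID_PARAMETER_VALUE",
  "message": "Served entity system.ai.gpt-oss-120b is not eligible for Provisioned Throughput."
}
"""

summary = summarize_error(response_body)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The request ID is what lets the failing call be matched against server-side logs, so it is worth including when following up on this error.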
Other Information
- OS: macOS
- Version: Sequoia 15.7.1
Additional context
theDarkDuke