[ISSUE] cannot deploy foundation models provisioned throughput endpoint via sdk #1081

@marianreuss

Description
I'm trying to deploy a gpt-oss-120b provisioned throughput endpoint via the SDK (0.69.0), and the request fails with the following error:

databricks.sdk.errors.platform.InvalidParameterValue: Served entity system.ai.gpt-oss-120b is not eligible for Provisioned Throughput.

However, deploying the foundation model via the UI works just fine.

Reproduction

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import EndpointCoreConfigInput, ServedEntityInput

w = WorkspaceClient()

w.serving_endpoints.create_and_wait(
    name="pt_endpoint",
    config=EndpointCoreConfigInput(
        served_entities=[
            ServedEntityInput(
                name="foo",
                entity_name="system.ai.gpt-oss-120b",
                scale_to_zero_enabled=False,
                entity_version=1,
                min_provisioned_throughput=10,
                max_provisioned_throughput=100,
            )
        ]
    )
)

Expected behavior
The model serving endpoint is deployed.

Is it a regression?
Not that I know of. Tested some older versions, same result.

Debug Logs

INFO:databricks.sdk:loading DEFAULT profile from ~/.databrickscfg: host, token
DEBUG:databricks.sdk:Attempting to configure auth: pat
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): <workspace-url.com>:443
DEBUG:urllib3.connectionpool:<workspace-url.com>:443 "POST /api/2.0/serving-endpoints HTTP/11" 400 None
DEBUG:databricks.sdk:POST /api/2.0/serving-endpoints
> {
>   "config": {
>     "served_entities": [
>       {
>         "entity_name": "system.ai.gpt-oss-120b",
>         "entity_version": 1,
>         "max_provisioned_throughput": 100,
>         "min_provisioned_throughput": 10,
>         "name": "foo",
>         "scale_to_zero_enabled": false
>       }
>     ]
>   },
>   "name": "pt_endpoint"
> }
< 400 Bad Request
< {
<   "details": [
<     {
<       "@type": "type.googleapis.com/google.rpc.RequestInfo",
<       "request_id": "3b1a97ca-27cb-4db7-8975-8e2104617a10",
<       "serving_data": ""
<     }
<   ],
<   "error_code": "INVALID_PARAMETER_VALUE",
<   "message": "Served entity system.ai.gpt-oss-120b is not eligible for Provisioned Throughput."
< }
Traceback (most recent call last):
  File "/Users/user.name/projects/deploy_pt/main.py", line 9, in <module>
    w.serving_endpoints.create_and_wait(
  File "/Users/user.name/miniconda3/lib/python3.12/site-packages/databricks/sdk/service/serving.py", line 4157, in create_and_wait
    return self.create(
           ^^^^^^^^^^^^
  File "/Users/user.name/miniconda3/lib/python3.12/site-packages/databricks/sdk/service/serving.py", line 4136, in create
    op_response = self._api.do("POST", "/api/2.0/serving-endpoints", body=body, headers=headers)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user.name/miniconda3/lib/python3.12/site-packages/databricks/sdk/core.py", line 85, in do
    return self._api_client.do(
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/user.name/miniconda3/lib/python3.12/site-packages/databricks/sdk/_base_client.py", line 199, in do
    response = call(
               ^^^^^
  File "/Users/user.name/miniconda3/lib/python3.12/site-packages/databricks/sdk/retries.py", line 59, in wrapper
    raise err
  File "/Users/user.name/miniconda3/lib/python3.12/site-packages/databricks/sdk/retries.py", line 38, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user.name/miniconda3/lib/python3.12/site-packages/databricks/sdk/_base_client.py", line 301, in _perform
    raise error from None
databricks.sdk.errors.platform.InvalidParameterValue: Served entity system.ai.gpt-oss-120b is not eligible for Provisioned Throughput.
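
For triage, the fields of the 400 response shown in the log above can be pulled out with standard-library Python. This is only a minimal sketch: `summarize_error` is a hypothetical helper and not part of the SDK, and the body is copied verbatim from the debug log rather than fetched from a live workspace.

```python
import json

# Error body copied verbatim from the debug log above.
ERROR_BODY = """
{
  "details": [
    {
      "@type": "type.googleapis.com/google.rpc.RequestInfo",
      "request_id": "3b1a97ca-27cb-4db7-8975-8e2104617a10",
      "serving_data": ""
    }
  ],
  "error_code": "INVALID_PARAMETER_VALUE",
  "message": "Served entity system.ai.gpt-oss-120b is not eligible for Provisioned Throughput."
}
"""


def summarize_error(body: str) -> dict:
    """Hypothetical helper: extract the fields useful for a support ticket."""
    payload = json.loads(body)
    # The request ID lives in the RequestInfo entry of "details", if present.
    request_ids = [
        detail["request_id"]
        for detail in payload.get("details", [])
        if detail.get("@type", "").endswith("RequestInfo") and "request_id" in detail
    ]
    return {
        "error_code": payload.get("error_code"),
        "message": payload.get("message"),
        "request_id": request_ids[0] if request_ids else None,
    }


summary = summarize_error(ERROR_BODY)
print(summary["error_code"], summary["request_id"])
```

The request ID is the piece most likely to help whoever investigates the server-side eligibility check, since the rejection comes from the serving API rather than from client-side validation in the SDK.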

Other Information

  • OS: macOS
  • Version: Sequoia 15.7.1

Additional context
