
Provider Header-Based Credentials Fail Due to Validator Type Casting #4370

@asimurka

Description

System Info

Llama-stack version: 0.3.4

Information

  • The official example scripts
  • My own modified scripts

🐛 Describe the bug

We have a problem with the Azure provider similar to the one we once discussed in #4058:
if azure_api_key is specified in the provider data header of the llama stack client, the validator casts it to SecretStr, but the secret is never unwrapped afterwards.
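
For reproduction, here is a minimal sketch of a request that triggers this code path (the server address, payload, and model id are illustrative assumptions; the header name and the azure_api_key field come from the code below):

# hypothetical reproduction script; adjust host/port and model to your deployment
import json

import httpx

provider_data = {
    "azure_api_key": "<AZURE_API_KEY>",  # cast to SecretStr by the validator
    "azure_api_base": "https://your-resource-name.openai.azure.com",
}

resp = httpx.post(
    "http://localhost:8321/v1/responses",  # assumed default llama stack port
    headers={"x-llamastack-provider-data": json.dumps(provider_data)},
    json={"model": "azure/gpt-4o-mini", "input": "Hello"},  # illustrative model id
    timeout=60,
)
print(resp.status_code, resp.text)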

Here are the relevant parts of the code:

# llama_stack.providers.utils.inference.openai_mixin.py
class OpenAIMixin(NeedsRequestProviderData, ABC, BaseModel):
    ...

    @property
    def client(self) -> AsyncOpenAI:
        """
        Get an AsyncOpenAI client instance.

        Uses the abstract methods get_api_key() and get_base_url() which must be
        implemented by child classes.

        Users can also provide the API key via the provider data header, which
        is used instead of any config API key.
        """

        api_key = self._get_api_key_from_config_or_provider_data()
        if not api_key:
            message = "API key not provided."
            if self.provider_data_api_key_field:
                message += f' Please provide a valid API key in the provider data header, e.g. x-llamastack-provider-data: {{"{self.provider_data_api_key_field}": "<API_KEY>"}}.'
            raise ValueError(message)

        return AsyncOpenAI(
            api_key=api_key,
            base_url=self.get_base_url(),  # <--- url is correctly unwrapped
            **self.get_extra_client_params(),
        )

    def _get_api_key_from_config_or_provider_data(self) -> str | None:
        api_key = self.get_api_key()
        if self.provider_data_api_key_field:
            provider_data = self.get_request_provider_data()
            if provider_data and getattr(provider_data, self.provider_data_api_key_field, None):
                api_key = getattr(provider_data, self.provider_data_api_key_field)  # <--- the secret should be unwrapped with 'get_secret_value()'

        return api_key

The Azure provider data validator parses the attributes passed via the provider header and casts them to SecretStr and HttpUrl (the URL is later unwrapped correctly; the API key is not):

# llama_stack.providers.remote.inference.azure.config.py
class AzureProviderDataValidator(BaseModel):
    azure_api_key: SecretStr = Field(  # <--- azure_api_key is cast to SecretStr but isn't unwrapped later
        description="Azure API key for Azure",
    )
    azure_api_base: HttpUrl = Field(
        description="Azure API base for Azure (e.g., https://your-resource-name.openai.azure.com)",
    )
    azure_api_version: str | None = Field(
        default=None,
        description="Azure API version for Azure (e.g., 2024-06-01)",
    )
    azure_api_type: str | None = Field(
        default="azure",
        description="Azure API type for Azure (e.g., azure)",
    )
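
To see why the missing unwrap breaks authentication: pydantic masks a SecretStr whenever it is coerced to a plain string, so the upstream request most likely carries the mask instead of the real key, which would explain the 401 below. A minimal demonstration:

from pydantic import BaseModel, SecretStr

class Demo(BaseModel):
    azure_api_key: SecretStr

d = Demo(azure_api_key="real-key")
print(str(d.azure_api_key))                # '**********' (masked)
print(d.azure_api_key.get_secret_value())  # 'real-key'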

In summary, the method _get_api_key_from_config_or_provider_data gets the OpenAI client attributes. If an Azure API key is present in the provider header, it takes precedence over the API key from the config, which is crucial for our implementation. However, the API key from the provider header is parsed into a SecretStr by the validator and is never unwrapped again, which causes an error. If I add unwrapping via get_secret_value(), as sketched below, everything works correctly.
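
A minimal sketch of that fix (an assumption of how the patch could look, not a merged change; only the SecretStr unwrap is new):

# llama_stack.providers.utils.inference.openai_mixin.py (proposed change)
    def _get_api_key_from_config_or_provider_data(self) -> str | None:
        api_key = self.get_api_key()
        if self.provider_data_api_key_field:
            provider_data = self.get_request_provider_data()
            if provider_data and getattr(provider_data, self.provider_data_api_key_field, None):
                api_key = getattr(provider_data, self.provider_data_api_key_field)
                # The validator may have cast the header value to SecretStr;
                # unwrap it so AsyncOpenAI gets the raw key instead of the mask.
                # Assumes `from pydantic import SecretStr` at module top.
                if isinstance(api_key, SecretStr):
                    api_key = api_key.get_secret_value()

        return api_key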

Error logs

ERROR    2025-12-10 14:00:08,236 llama_stack.core.server.server:290 core::server: Error executing endpoint              
         route='/v1/responses' method='post'                                                                            
         ╭───────────────────────────────────── Traceback (most recent call last) ─────────────────────────────────────╮
         │ /home/****/.venv/lib/python3.12/site-packages/llama_stack/core/server/serv │
         │ er.py:280 in route_handler                                                                                  │
         │                                                                                                             │
         │   277 │   │   │   │   │   return StreamingResponse(gen, media_type="text/event-stream")                     │
         │   278 │   │   │   │   else:                                                                                 │
         │   279 │   │   │   │   │   value = func(**kwargs)                                                            │
         │ ❱ 280 │   │   │   │   │   result = await maybe_await(value)                                                 │
         │   281 │   │   │   │   │   if isinstance(result, PaginatedResponse) and result.url is None:                  │
         │   282 │   │   │   │   │   │   result.url = route                                                            │
         │   283                                                                                                       │
         │                                                                                                             │
         │ /home/****/.venv/lib/python3.12/site-packages/llama_stack/core/server/serv │
         │ er.py:202 in maybe_await                                                                                    │
         │                                                                                                             │
         │   199                                                                                                       │
         │   200 async def maybe_await(value):                                                                         │
         │   201 │   if inspect.iscoroutine(value):                                                                    │
         │ ❱ 202 │   │   return await value                                                                            │
         │   203 │   return value                                                                                      │
         │   204                                                                                                       │
         │   205                                                                                                       │
         │                                                                                                             │
         │ /home/****/.venv/lib/python3.12/site-packages/llama_stack/providers/inline │
         │ /agents/meta_reference/agents.py:344 in create_openai_response                                              │
         │                                                                                                             │
         │   341 │   │   max_infer_iters: int | None = 10,                                                             │
         │   342 │   │   guardrails: list[ResponseGuardrail] | None = None,                                            │
         │   343 │   ) -> OpenAIResponseObject:                                                                        │
         │ ❱ 344 │   │   return await self.openai_responses_impl.create_openai_response(                               │
         │   345 │   │   │   input,                                                                                    │
         │   346 │   │   │   model,                                                                                    │
         │   347 │   │   │   instructions,                                                                             │
         │                                                                                                             │
         │ /home/****/.venv/lib/python3.12/site-packages/llama_stack/providers/inline │
         │ /agents/meta_reference/responses/openai_responses.py:324 in create_openai_response                          │
         │                                                                                                             │
         │   321 │   │   │   │   │   if failed_response and failed_response.error                                      │
         │   322 │   │   │   │   │   else "Response stream failed without error details"                               │
         │   323 │   │   │   │   )                                                                                     │
         │ ❱ 324 │   │   │   │   raise RuntimeError(f"OpenAI response failed: {error_message}")                        │
         │   325 │   │   │                                                                                             │
         │   326 │   │   │   if final_response is None:                                                                │
         │   327 │   │   │   │   raise ValueError("The response stream never reached a terminal state")                │
         ╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
         RuntimeError: OpenAI response failed: Error code: 401 - {'error': {'code': '401', 'message': 'Access denied due
         to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription 
         and use a correct regional API endpoint for your resource.'}}                                                  

Expected behavior

The AsyncOpenAI client receives the API key correctly unwrapped from the validated SecretStr in the _get_api_key_from_config_or_provider_data method.
