feat: add forward_headers support to inference passthrough provider by skamenan7 · Pull Request #5134 · llamastack/llama-stack

skamenan7 · 2026-03-13T14:59:50Z

What does this PR do?

Adds per-request HTTP header forwarding to the remote::passthrough inference provider, following the pattern established by the safety passthrough provider (PR #5004, already merged).

A forward_headers config field maps provider-data keys to outbound HTTP header names. Only explicitly listed keys are forwarded from X-LlamaStack-Provider-Data to the downstream service (default-deny). An extra_blocked_headers field lets operators add custom blocked names on top of the core security list.
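The mapping and default-deny behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from providers/utils/forward_headers.py: the function names `sanitize_header_value` and `build_forwarded_headers_sketch` are hypothetical, SecretStr unwrapping is left out, and keeping the first entry when two mappings target the same header name (case-insensitively) is an assumption.

```python
from collections.abc import Mapping


def sanitize_header_value(value: str) -> str:
    """Drop CR and LF so a value cannot inject extra header lines."""
    return value.replace("\r", "").replace("\n", "")


def build_forwarded_headers_sketch(
    provider_data: Mapping[str, object],
    forward_headers: Mapping[str, str],
) -> dict[str, str]:
    """Map provider-data keys to outbound header names (hypothetical helper).

    Only keys listed in forward_headers are considered (default-deny).
    """
    headers: dict[str, str] = {}
    seen: set[str] = set()  # lower-cased header names already emitted
    for key, header_name in forward_headers.items():
        if key not in provider_data:
            continue  # caller did not supply this key
        lowered = header_name.lower()
        if lowered in seen:
            continue  # assumption: first mapping wins on a case-insensitive clash
        seen.add(lowered)
        headers[header_name] = sanitize_header_value(str(provider_data[key]))
    return headers
```

With an empty `forward_headers` mapping the result is always an empty dict, which is the default-deny case.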

The shared utility providers/utils/forward_headers.py is used by both the inference and safety passthrough providers, keeping the forwarding logic and blocked-header policy in one place.

Closes #5040
Relates #4607

Test Plan

Unit tests cover the full path — config validation, header extraction, CRLF sanitization, blocked-header enforcement, auth priority chain, and concurrent request isolation:

uv run pytest tests/unit/providers/inference/test_passthrough_forward_headers.py -v

Tests cover:

build_forwarded_headers() — key mapping, default-deny, CRLF stripping, SecretStr unwrap, case-insensitive dedup
validate_forward_headers_config() — blocked header rejection, operator extra blocklist, invalid names
Adapter auth priority — static api_key > passthrough_api_key > forwarded Authorization
Provider data validator — extra fields preserved for forwarding, reserved keys rejected
Concurrent request isolation — contextvars don't leak between parallel requests

Also tested end-to-end locally against a mock inference server and a mock /v1/moderations server. Headers land on the downstream exactly as configured and blocked headers are rejected at stack startup, not at request time.
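Rejecting blocked headers at startup means the check runs when the config is loaded rather than per request. A rough sketch of such a check is below. The blocklist contents and the function name `validate_forward_headers_sketch` are assumptions for illustration; the actual core security list lives in the shared utility and may differ.

```python
import re

# Illustrative core blocklist only; the real list may contain other names.
CORE_BLOCKED = frozenset({"host", "content-length", "transfer-encoding", "connection"})

# Characters allowed in an HTTP field name (the "token" rule in RFC 9110).
_TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def validate_forward_headers_sketch(forward_headers, extra_blocked_headers=()):
    """Raise ValueError for invalid or blocked target header names."""
    blocked = CORE_BLOCKED | {name.lower() for name in extra_blocked_headers}
    for key, header_name in forward_headers.items():
        if not _TOKEN_RE.fullmatch(header_name):
            raise ValueError(f"invalid header name for key {key!r}: {header_name!r}")
        if header_name.lower() in blocked:
            raise ValueError(f"header {header_name!r} is blocked and cannot be forwarded")
    return dict(forward_headers)
```

Calling this once from config validation would make a bad mapping fail before any request is served.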

Example config:

providers:
  inference:
    - provider_id: maas-inference
      provider_type: remote::passthrough
      config:
        base_url: ${env.PASSTHROUGH_URL}
        forward_headers:
          maas_api_token: "Authorization"
          tenant_id: "X-Tenant-ID"

Callers pass credentials via X-LlamaStack-Provider-Data:

curl http://localhost:8321/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'X-LlamaStack-Provider-Data: {"maas_api_token": "Bearer user-jwt", "tenant_id": "acme"}' \
  -d '{"model": "passthrough/my-model", "messages": [{"role": "user", "content": "hello"}]}'

The downstream receives Authorization: Bearer user-jwt and X-Tenant-ID: acme. Forwarding is default-deny: only keys explicitly listed in forward_headers are sent downstream, and any other keys in X-LlamaStack-Provider-Data never leave the stack.
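The forwarded Authorization header in this example is the lowest-priority credential in the chain listed in the test plan (static api_key > passthrough_api_key > forwarded Authorization). One way that resolution could look is sketched below. The function name is hypothetical, and sending the two keys as Bearer tokens is an assumption rather than something stated in this PR.

```python
from typing import Optional


def resolve_authorization_sketch(
    static_api_key: Optional[str],
    passthrough_api_key: Optional[str],
    forwarded_headers: dict,
) -> Optional[str]:
    """Pick the Authorization value by priority (hypothetical helper)."""
    if static_api_key:
        # Highest priority: key from the provider config.
        return f"Bearer {static_api_key}"  # assumption: Bearer scheme
    if passthrough_api_key:
        # Next: the passthrough_api_key value.
        return f"Bearer {passthrough_api_key}"  # assumption: Bearer scheme
    for name, value in forwarded_headers.items():
        # Lowest priority: an Authorization header produced by forward_headers.
        if name.lower() == "authorization":
            return value
    return None
```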

skamenan7 · 2026-03-13T15:49:44Z

cc: @leseb I have addressed your comments on the safety PR #5004 here and, as you suggested, refactored the logic into a common utility so it can be easily reused in follow-on PRs for other providers. Thanks!

PS: I can also open a separate GitHub issue and PR for the safety passthrough changes (#5004) if keeping them separate from this PR makes sense.


cdoern

looks pretty good. one question

src/llama_stack/providers/utils/forward_headers.py

cdoern

looks reasonable now, one question

src/llama_stack/providers/utils/forward_headers.py

leseb · 2026-03-23T13:08:52Z

@skamenan7 unit tests are failing

src/llama_stack/providers/remote/inference/passthrough/__init__.py

skamenan7 · 2026-03-23T15:03:57Z

CI Status / ci-status (pull_request): Failing after 7m (Required)

Unit Tests / unit-tests (3.12) (pull_request): Failing after 1m

@leseb it looks like the unit-tests (3.12) failure is a pre-existing flake in test_remote_vllm.py::test_openai_chat_completion_is_async, unrelated to this PR. It's a timing-sensitive test that runs 4 parallel 0.5s coroutines and asserts the total time is <1.0s; on a loaded CI runner it occasionally exceeds that limit. The test passes consistently locally and isn't touched by our diff.
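For context, the general shape of such a timing-based concurrency check is roughly the following. This is a generic sketch, not the code of test_openai_chat_completion_is_async; `fake_completion` and `timed_parallel` are made-up names.

```python
import asyncio
import time


async def fake_completion(delay: float) -> str:
    """Stand-in for an async call that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return "ok"


async def timed_parallel(n: int, delay: float):
    """Run n calls concurrently and return (elapsed_seconds, results)."""
    start = time.perf_counter()
    results = await asyncio.gather(*(fake_completion(delay) for _ in range(n)))
    return time.perf_counter() - start, results
```

If the calls really overlap, the elapsed time stays close to one `delay` rather than `n * delay`. Asserting a tight wall-clock bound on that is what makes this kind of test sensitive to scheduler load.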


meta-cla bot added the CLA Signed label Mar 13, 2026

skamenan7 force-pushed the feat/5040-Inference-passthrough-provider branch 2 times, most recently from 699c2ff to b25b54a March 13, 2026 15:41

skamenan7 marked this pull request as ready for review March 13, 2026 15:45

skamenan7 requested review from ashwinb, bbrowning, cdoern, ehhuang, franciscojavierarceo, leseb, mattf and raghotham as code owners March 13, 2026 15:45

skamenan7 force-pushed the feat/5040-Inference-passthrough-provider branch 2 times, most recently from 6c5a64b to f8ecbf6 March 13, 2026 21:30

This was referenced Mar 13, 2026

feat: support forwarding extra headers from provider data to upstream endpoints #5100

Closed

feat: MCP tool runtime auth header passthrough with admin-configured header names #5152

Open

skamenan7 force-pushed the feat/5040-Inference-passthrough-provider branch 4 times, most recently from 58ae1c4 to 18fa1ad March 16, 2026 21:00

skamenan7 mentioned this pull request Mar 17, 2026

bug: blocking sync requests calls in async methods of safety/nvidia and eval/nvidia providers #5178

Closed

skamenan7 force-pushed the feat/5040-Inference-passthrough-provider branch 2 times, most recently from c3fd868 to 714278a March 17, 2026 21:37

leseb requested changes Mar 18, 2026

uv.lock (outdated, resolved)

src/llama_stack/providers/utils/forward_headers.py (resolved)

tests/integration/safety/test_passthrough.py (resolved)

skamenan7 force-pushed the feat/5040-Inference-passthrough-provider branch 2 times, most recently from 907a43c to cfc16a8 March 18, 2026 11:15

skamenan7 requested a review from leseb March 18, 2026 11:29

skamenan7 force-pushed the feat/5040-Inference-passthrough-provider branch from 952ca64 to d4097a6 March 18, 2026 11:33

cdoern reviewed Mar 18, 2026

src/llama_stack/providers/utils/forward_headers.py (resolved)

skamenan7 force-pushed the feat/5040-Inference-passthrough-provider branch 7 times, most recently from c1f918b to 03d8dcb March 19, 2026 19:46

cdoern reviewed Mar 20, 2026

src/llama_stack/providers/utils/forward_headers.py (outdated, resolved)

skamenan7 force-pushed the feat/5040-Inference-passthrough-provider branch 3 times, most recently from f838254 to e18e4f4 March 20, 2026 17:16

skamenan7 requested a review from cdoern March 20, 2026 17:52

skamenan7 force-pushed the feat/5040-Inference-passthrough-provider branch 2 times, most recently from 5c9b9be to 5c8c7e1 March 23, 2026 11:49

leseb reviewed Mar 23, 2026

src/llama_stack/providers/remote/inference/passthrough/__init__.py (outdated, resolved)

skamenan7 force-pushed the feat/5040-Inference-passthrough-provider branch from 5c8c7e1 to 5b35b2e March 23, 2026 13:31

skamenan7 requested a review from leseb March 23, 2026 14:20

skamenan7 force-pushed the feat/5040-Inference-passthrough-provider branch from 9bf5003 to 1b67e61 March 23, 2026 14:31

skamenan7 force-pushed the feat/5040-Inference-passthrough-provider branch 3 times, most recently from 850056a to cd8018e March 23, 2026 16:01

skamenan7 added 3 commits March 23, 2026 14:01

feat: add secure passthrough header forwarding with forward_headers and extra_blocked_headers (c7f5dcd)

use AnyHttpUrl for passthrough_url in provider data validator (032a371)

use HttpUrl (not AnyHttpUrl) for passthrough_url, consistent with rest of codebase (30b3461)

skamenan7 force-pushed the feat/5040-Inference-passthrough-provider branch from cd8018e to 30b3461 March 23, 2026 18:01

skamenan7 added a commit to skamenan7/llama-stack that referenced this pull request Mar 23, 2026

sync passthrough_url to HttpUrl to match upstream PR llamastack#5134 update (bfbfeb5)

This was referenced Mar 23, 2026

add forward_headers passthrough to remote::model-context-protocol skamenan7/llama-stack#2

Open

add forward_headers passthrough to remote::model-context-protocol #5257

Draft

skamenan7 wants to merge 3 commits into llamastack:main from skamenan7:feat/5040-Inference-passthrough-provider
