Description
🚀 Describe the new functionality needed
Add observability support for the OpenAI Responses API in llama-stack. This involves integrating OpenTelemetry instrumentation so that calls made through the Responses API (both sync and streaming) produce proper traces and spans, consistent with the existing chat completions observability.
This feature depends on upstream work in the OpenTelemetry Python contrib repository. The following PRs need to be merged and released first:
- Implement OpenAI Responses API instrumentation and examples: open-telemetry/opentelemetry-python-contrib#4166
- Add response wrappers for OpenAI Responses API streams: open-telemetry/opentelemetry-python-contrib#4280
- feat: OpenAI responses extractors: open-telemetry/opentelemetry-python-contrib#4337
Once these upstream PRs are merged and a new opentelemetry-instrumentation-openai-v2 release is available, llama-stack can integrate the updated package to enable Responses API observability.
This is a sub-task of #2596.
💡 Why is this needed? What if we don't build it?
Llama-stack already supports the OpenAI Responses API, but there is currently no telemetry coverage for it. Without this, users have no visibility into Responses API call latency, token usage, errors, or streaming behavior through their observability stack (e.g., Jaeger, Grafana). This makes debugging and performance monitoring significantly harder for anyone using the Responses API.
Other thoughts
- This is blocked on the upstream OTel contrib PRs: no implementation work should start until those are merged and released, but we can do some early testing with patched code.
- Once available, integration should be straightforward: bump the opentelemetry-instrumentation-openai-v2 dependency version and verify traces are emitted for Responses API calls.