
Investigate potential OTel context leak in vector store background tasks #5205

@iamemilio

Description

Summary

PR #5168 fixed OTel context leaks in the InferenceStore and OpenAIResponsesImpl background workers by introducing create_task_with_detached_otel_context. During review, @jaideepr97 identified three additional asyncio.create_task call sites in openai_vector_store_mixin.py that may have the same bug — background tasks inheriting and permanently retaining the creating request's OTel trace context.

Potentially affected locations

All in src/llama_stack/providers/utils/memory/openai_vector_store_mixin.py:

  1. Line ~394 — Resuming a file batch on startup:

    task = asyncio.create_task(self._process_file_batch_async(batch_id, batch_info, remaining_files))
  2. Line ~1317 — Starting background processing of a new file batch:

    task = asyncio.create_task(self._process_file_batch_async(batch_id, batch_info))
  3. Line ~1327 — Running throttled cleanup of expired file batches:

    asyncio.create_task(self._cleanup_expired_file_batches())

Expected behavior

Each background task should either:

  • Start with a detached OTel context (using create_task_with_detached_otel_context), or
  • Carry per-item context from the originating request (using capture_otel_context / activate_otel_context)

so that spans are attributed to the correct request trace and don't leak across unrelated requests.

Steps to reproduce

  1. Enable OTel tracing (e.g. export to Jaeger)
  2. Send concurrent requests that trigger vector store file batch operations
  3. Inspect traces in Jaeger — look for inflated trace durations or spans from unrelated requests appearing under a single trace
