
Investigate potential OTel context leak in vector store background tasks #5205

@iamemilio

Description

Summary

PR #5168 fixed OTel context leaks in the InferenceStore and OpenAIResponsesImpl background workers by introducing create_task_with_detached_otel_context. During review, @jaideepr97 identified three additional asyncio.create_task call sites in openai_vector_store_mixin.py that may have the same bug — background tasks inheriting and permanently retaining the creating request's OTel trace context.

Potentially affected locations

All in src/llama_stack/providers/utils/memory/openai_vector_store_mixin.py:

  1. Line ~394 — Resuming a file batch on startup:

    task = asyncio.create_task(self._process_file_batch_async(batch_id, batch_info, remaining_files))
  2. Line ~1317 — Starting background processing of a new file batch:

    task = asyncio.create_task(self._process_file_batch_async(batch_id, batch_info))
  3. Line ~1327 — Running throttled cleanup of expired file batches:

    asyncio.create_task(self._cleanup_expired_file_batches())

Expected behavior

Each background task should either:

  • Start with a detached OTel context (using create_task_with_detached_otel_context), or
  • Carry per-item context from the originating request (using capture_otel_context / activate_otel_context)

so that spans are attributed to the correct request trace and don't leak across unrelated requests.

Steps to reproduce

  1. Enable OTel tracing (e.g. export to Jaeger)
  2. Send concurrent requests that trigger vector store file batch operations
  3. Inspect traces in Jaeger — look for inflated trace durations or spans from unrelated requests appearing under a single trace
