Extract LLM attributes from any OTEL convention, not just GenAI#2431

Open
csansoon wants to merge 1 commit into latitude-v2 from convention-agnostic-span-processor

Conversation

@csansoon (Contributor)


The span ingestion pipeline only recognized `gen_ai.*` attribute keys when extracting LLM-specific fields into promoted ClickHouse columns. Spans arriving via OpenInference, OpenLLMetry, Vercel AI SDK, or OpenAI Agents SDK had their LLM columns left empty despite carrying equivalent data under different keys and vocabularies.

This introduces a multi-convention extraction layer with two components: **scalar attribute resolvers** and **content payload parsers**.

Each promoted column is resolved from a priority-ordered list of convention-specific candidates. The first candidate that returns a value wins. Value translation is applied where conventions use different vocabularies.

| Column | GenAI current | GenAI deprecated / OpenLLMetry | OpenInference | Vercel AI SDK |
|---|---|---|---|---|
| `operation` | `gen_ai.operation.name` | `llm.request.type` (maps `completion`→`text_completion`, `embedding`→`embeddings`, etc.) | `openinference.span.kind` (maps `LLM`→`chat`, `EMBEDDING`→`embeddings`, `TOOL`→`execute_tool`, etc.) | `ai.operationId` (maps `ai.generateText`→`chat`, `ai.toolCall`→`execute_tool`, etc.) |
| `provider` | `gen_ai.provider.name` | `gen_ai.system` (aliases `bedrock`→`aws.bedrock`, `gemini`→`gcp.gemini`, `mistral`→`mistral_ai`, etc.) | `llm.system` (aliases `mistralai`→`mistral_ai`, `xai`→`x_ai`, `vertexai`→`gcp.vertex_ai`) | `ai.model.provider` (strips `.chat`/`.messages`/`.responses` suffixes, aliases `google.generative-ai`→`gcp.gemini`, `amazon-bedrock`→`aws.bedrock`) |
| `model` | `gen_ai.request.model` | same | `llm.model_name`, `embedding.model_name`, `reranker.model_name` | `ai.model.id` |
| `response_model` | `gen_ai.response.model` | same | `llm.model_name` (no request/response distinction) | `ai.response.model` |
| `tokens_input` | `gen_ai.usage.input_tokens` | `gen_ai.usage.prompt_tokens` | `llm.token_count.prompt` | `ai.usage.promptTokens` |
| `tokens_output` | `gen_ai.usage.output_tokens` | `gen_ai.usage.completion_tokens` | `llm.token_count.completion` | `ai.usage.completionTokens` |
| `tokens_cache_read` | `gen_ai.usage.cache_read.input_tokens` | same | `llm.token_count.prompt_details.cache_read` | — |
| `tokens_cache_create` | `gen_ai.usage.cache_creation.input_tokens` | same | `llm.token_count.prompt_details.cache_write` | — |
| `tokens_reasoning` | `gen_ai.usage.reasoning_tokens` | same | `llm.token_count.completion_details.reasoning` | — |
| `response_id` | `gen_ai.response.id` | same | — | `ai.response.id` |
| `finish_reasons` | `gen_ai.response.finish_reasons` (string[]) | same | — | `ai.response.finishReason` (singular string, wrapped to array; `tool-calls`→`tool_calls`, `content-filter`→`content_filter`) |
| `session_id` | `gen_ai.conversation.id` | same | `session.id` | — |
| `cost_*_microcents` | — | `gen_ai.usage.cost` (total only, USD float→microcents) | `llm.cost.prompt`, `llm.cost.completion`, `llm.cost.total` (USD float→microcents) | — |
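The resolver mechanism described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual implementation: the type names, `resolve`, and `usdToMicrocents` are hypothetical, while the attribute keys and vocabulary mappings come from the table.

```typescript
// Hypothetical sketch of a priority-ordered scalar attribute resolver.
type Attrs = Record<string, unknown>;

interface Candidate {
  key: string;
  // Optional vocabulary translation, e.g. llm.request.type values → GenAI operation names.
  map?: Record<string, string>;
}

function resolve(attrs: Attrs, candidates: Candidate[]): string | undefined {
  for (const { key, map } of candidates) {
    const raw = attrs[key];
    if (raw === undefined || raw === null) continue; // first candidate with a value wins
    const value = String(raw);
    return map?.[value] ?? value; // translate vocabulary where needed
  }
  return undefined;
}

// `operation` column: GenAI current first, then deprecated/OpenLLMetry,
// then OpenInference, then Vercel AI SDK (mappings abridged from the table).
const operationCandidates: Candidate[] = [
  { key: "gen_ai.operation.name" },
  { key: "llm.request.type", map: { completion: "text_completion", embedding: "embeddings" } },
  { key: "openinference.span.kind", map: { LLM: "chat", EMBEDDING: "embeddings", TOOL: "execute_tool" } },
  { key: "ai.operationId", map: { "ai.generateText": "chat", "ai.toolCall": "execute_tool" } },
];

// Cost columns: USD float → microcents (1 USD = 100 cents = 1e8 microcents).
function usdToMicrocents(usd: number): number {
  return Math.round(usd * 1e8);
}
```

A span carrying only `openinference.span.kind: "LLM"` would resolve `operation` to `chat`, while a span with both `gen_ai.operation.name` and `llm.request.type` takes the GenAI value because it sits first in the priority list.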

OpenAI Agents SDK spans are handled implicitly — when bridged to OTEL via the official instrumentor, they emit GenAI convention attributes.

LLM message payloads use fundamentally different storage structures across conventions, so each gets a dedicated parser with sentinel-based detection:

- **GenAI current** (sentinel: `gen_ai.input.messages` or `gen_ai.output.messages`): Parses structured/JSON messages already in GenAI parts-based format. Extracts `gen_ai.system_instructions` and `gen_ai.tool.definitions` as dedicated attributes.

- **GenAI deprecated / OpenLLMetry** (sentinel: `gen_ai.prompt` or `gen_ai.completion`): Parses flat JSON strings containing `{role, content}` message arrays. Translates to GenAI format via `rosetta-ai` auto-detection. Extracts `llm.request.functions` for tool definitions.

- **OpenInference** (sentinel: `llm.input_messages.*` prefix or `openinference.span.kind`): Reassembles flattened indexed span attributes (`llm.input_messages.{i}.message.role`, `.content`, `.tool_calls.{j}.tool_call.function.name`, etc.) by scanning, grouping by index, and sorting. Reconstructs `llm.tools.{i}.tool.json_schema` for tool definitions. Translates reassembled messages via `rosetta-ai`.

- **Vercel AI SDK** (sentinel: `ai.prompt` or `ai.prompt.messages`): Handles both top-level spans (`ai.prompt` JSON with `system` + `messages` fields) and call-level spans (`ai.prompt.messages` JSON array). Reconstructs output from split `ai.response.text` + `ai.response.toolCalls`. Parses `ai.prompt.tools` string array for tool definitions. Translates via `rosetta-ai` with explicit `Provider.VercelAI`.
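The sentinel-based dispatch and the OpenInference reassembly step might look like the following sketch. Function names and the simplified detection order are illustrative assumptions; the sentinel keys and the flattened `llm.input_messages.{i}.message.*` layout are from the list above.

```typescript
// Hypothetical sketch: pick a content parser by sentinel attribute,
// and reassemble OpenInference's flattened indexed message attributes.
type SpanAttrs = Record<string, string>;

function detectConvention(attrs: SpanAttrs): string | undefined {
  if ("gen_ai.input.messages" in attrs || "gen_ai.output.messages" in attrs) return "genai";
  if ("gen_ai.prompt" in attrs || "gen_ai.completion" in attrs) return "openllmetry";
  if ("openinference.span.kind" in attrs ||
      Object.keys(attrs).some((k) => k.startsWith("llm.input_messages."))) return "openinference";
  if ("ai.prompt" in attrs || "ai.prompt.messages" in attrs) return "vercel";
  return undefined;
}

interface Message { role?: string; content?: string }

function reassembleOpenInference(attrs: SpanAttrs): Message[] {
  const byIndex = new Map<number, Message>();
  const re = /^llm\.input_messages\.(\d+)\.message\.(role|content)$/;
  // Scan all keys, group matching fields by message index.
  for (const [key, value] of Object.entries(attrs)) {
    const m = re.exec(key);
    if (!m) continue;
    const i = Number(m[1]);
    const msg = byIndex.get(i) ?? {};
    msg[m[2] as "role" | "content"] = value;
    byIndex.set(i, msg);
  }
  // Sort by index to restore the original message order.
  return [...byIndex.keys()].sort((a, b) => a - b).map((i) => byIndex.get(i)!);
}
```

The real parsers also handle nested tool-call indices (`.tool_calls.{j}.tool_call.function.name`) and JSON-encoded values, which this sketch omits.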

All raw span attributes remain in the dynamic `attr_*` maps regardless of whether they were also extracted to promoted columns.
csansoon force-pushed the convention-agnostic-span-processor branch from d65f288 to 9068ae4 on March 13, 2026, 16:50.