90 changes: 90 additions & 0 deletions util/opentelemetry-util-genai-dev/README.traceloop_translator.md
@@ -0,0 +1,90 @@
# Traceloop -> GenAI Semantic Convention Translator Emitter

This optional emitter promotes legacy `traceloop.*` attributes attached to an `LLMInvocation` into
Semantic Convention (or forward-looking custom `gen_ai.*`) attributes **before** the standard
Semantic Convention span emitter runs. It does **not** create its own span.

## Why Use It?
If you have upstream code (or the Traceloop compat emitter) producing `traceloop.*` keys but you
want downstream dashboards/tools to rely on GenAI semantic conventions, enabling this translator
lets you transition without rewriting upstream code immediately.

## What It Does
At `on_start` of an `LLMInvocation` it scans `invocation.attributes` for keys beginning with
`traceloop.` and (non-destructively) adds corresponding keys:

| Traceloop Key (prefixed or raw) | Added Key | Notes |
|---------------------------------|---------------------------|-------|
| `traceloop.workflow.name` / `workflow.name` | `gen_ai.workflow.name` | Custom (not yet in spec) |
| `traceloop.entity.name` / `entity.name` | `gen_ai.agent.name` | Approximates entity as agent name |
| `traceloop.entity.path` / `entity.path` | `gen_ai.workflow.path` | Custom placeholder |
| `traceloop.callback.name` / `callback.name` | `gen_ai.callback.name` | Also sets `gen_ai.operation.source` if absent |
| `traceloop.callback.id` / `callback.id` | `gen_ai.callback.id` | Custom |
| `traceloop.entity.input` / `entity.input` | `gen_ai.input.messages` | Serialized form already present |
| `traceloop.entity.output` / `entity.output` | `gen_ai.output.messages` | Serialized form already present |

Existing `gen_ai.*` keys are never overwritten.
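
For concreteness, here is a minimal sketch of the promotion pass described above. The mapping mirrors the table in this README, but the function name and structure are illustrative assumptions, not the emitter's actual internals:

```python
# Illustrative sketch only; the real logic lives in the TraceloopTranslator emitter.
_PROMOTIONS = {
    "workflow.name": "gen_ai.workflow.name",
    "entity.name": "gen_ai.agent.name",
    "entity.path": "gen_ai.workflow.path",
    "callback.name": "gen_ai.callback.name",
    "callback.id": "gen_ai.callback.id",
    "entity.input": "gen_ai.input.messages",
    "entity.output": "gen_ai.output.messages",
}


def promote_traceloop_keys(attributes: dict) -> None:
    for key, value in list(attributes.items()):
        # Accept both prefixed ("traceloop.entity.name") and raw ("entity.name") keys.
        suffix = key[len("traceloop."):] if key.startswith("traceloop.") else key
        target = _PROMOTIONS.get(suffix)
        if target is None or target in attributes:
            continue  # unknown key, or an existing gen_ai.* value wins
        attributes[target] = value
        if suffix == "callback.name":
            # Assumption: operation.source mirrors the callback name when unset.
            attributes.setdefault("gen_ai.operation.source", value)
```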

## Enabling
Fast path (no entry point needed):

```bash
export OTEL_GENAI_ENABLE_TRACELOOP_TRANSLATOR=1
export OTEL_INSTRUMENTATION_GENAI_EMITTERS=span,traceloop_compat

# Optional: remove the original traceloop.* keys after promotion
export OTEL_GENAI_TRACELOOP_TRANSLATOR_STRIP_LEGACY=1
```

The flag auto-prepends the translator before the semantic span emitter. You can still add
`traceloop_translator` explicitly once an entry point is created.

You can also load this emitter the same way as other extra emitters. There are two common patterns:

### 1. Via `OTEL_INSTRUMENTATION_GENAI_EMITTERS` with an extra token
If your emitter loading logic supports extra entry-point-based names directly (this depends on the branch state), add the translator token (e.g. `traceloop_translator`). Example:

```bash
export OTEL_INSTRUMENTATION_GENAI_EMITTERS=span,traceloop_translator,traceloop_compat
```

Ordering is important: the spec requests placement `before=semconv_span`, but if an environment override reorders span emitters you can enforce the order explicitly (see the next section).

### 2. Using Category Override Environment Variable
If your build supports category overrides (as implemented in `configuration.py`), you can prepend:

```bash
export OTEL_INSTRUMENTATION_GENAI_EMITTERS=span,traceloop_compat
export OTEL_INSTRUMENTATION_GENAI_EMITTERS_SPAN=prepend:TraceloopTranslator
```

The override ensures the translator emitter runs before the semantic span emitter regardless of default resolution order.

## Example
Minimal Python snippet (assuming emitters are loaded via entry points and the translator is installed):

```python
from opentelemetry.util.genai.handler import get_telemetry_handler
from opentelemetry.util.genai.types import LLMInvocation, InputMessage, OutputMessage, Text

inv = LLMInvocation(
    request_model="gpt-4",
    input_messages=[InputMessage(role="user", parts=[Text("Hello")])],
    attributes={
        "traceloop.entity.name": "ChatLLM",
        "traceloop.workflow.name": "user_flow",
        "traceloop.callback.name": "root_chain",
        "traceloop.entity.input": "[{'role':'user','content':'Hello'}]",
    },
)
handler = get_telemetry_handler()
handler.start_llm(inv)
inv.output_messages = [OutputMessage(role="assistant", parts=[Text("Hi")], finish_reason="stop")]
handler.stop_llm(inv)
# Result: final semantic span contains gen_ai.agent.name, gen_ai.workflow.name, gen_ai.input.messages, etc.
```

## Non-Goals
- By default it does not remove or rename the original `traceloop.*` attributes; stripping is opt-in via `OTEL_GENAI_TRACELOOP_TRANSLATOR_STRIP_LEGACY`.
- It does not attempt deep semantic inference; mappings are intentionally conservative.
- It does not serialize messages itself; it relies on upstream emitters to have placed serialized content already.
41 changes: 41 additions & 0 deletions util/opentelemetry-util-genai-dev/README.translator.md
@@ -0,0 +1,41 @@
# Translator

## Automatic Span Processing (Recommended)

Add `TraceloopSpanProcessor` to your TracerProvider to automatically transform all matching spans:

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.util.genai.processors import TraceloopSpanProcessor

# Set up tracer provider
provider = TracerProvider()

# Add processor - transforms all matching spans automatically
processor = TraceloopSpanProcessor(
    attribute_transformations={
        "remove": ["debug_info"],
        "rename": {"model_ver": "llm.model.version"},
        "add": {"service.name": "my-llm"},
    },
    name_transformations={"chat *": "llm.openai.chat"},
    traceloop_attributes={"traceloop.entity.name": "MyLLMEntity"},
)
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)
```

## Transformation Rules

### Attributes
- **Remove**: `"remove": ["field1", "field2"]`
- **Rename**: `"rename": {"old_name": "new_name"}`
- **Add**: `"add": {"key": "value"}`

### Span Names
- **Direct**: `"old name": "new name"`
- **Pattern**: `"chat *": "llm.chat"` (wildcard matching)
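
For the pattern rule, shell-style wildcard matching in the spirit of Python's `fnmatch` is one plausible reading; the sketch below is an assumption about the behavior, not the processor's actual code:

```python
from fnmatch import fnmatch


def transform_span_name(name: str, name_transformations: dict) -> str:
    for pattern, replacement in name_transformations.items():
        # Direct rule is an exact match; the pattern rule uses "*" wildcards,
        # so "chat *" matches "chat gpt-4" and rewrites it to "llm.chat".
        if name == pattern or fnmatch(name, pattern):
            return replacement
    return name  # no rule matched; keep the original name
```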
@@ -0,0 +1,59 @@
#!/usr/bin/env python3

from __future__ import annotations

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import (
    SimpleSpanProcessor,
    ConsoleSpanExporter,
)

from opentelemetry.util.genai.handler import get_telemetry_handler
from opentelemetry.util.genai.types import (
    LLMInvocation,
    InputMessage,
    OutputMessage,
    Text,
)


def run_example():
    provider = TracerProvider()
    provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(provider)

    # Build a telemetry handler (singleton) – emitters are chosen via env vars
    handler = get_telemetry_handler(tracer_provider=provider)

    # Include a few illustrative Traceloop-style attributes.
    # These will be mapped/prefixed automatically by the Traceloop compat emitter.
    invocation = LLMInvocation(
        request_model="gpt-4",
        input_messages=[InputMessage(role="user", parts=[Text("Hello")])],
        attributes={
            "custom.attribute": "value",  # arbitrary user attribute
            "traceloop.entity.name": "ChatLLM",
            "traceloop.workflow.name": "main_flow",
            "traceloop.entity.path": "root/branch/leaf",
            "traceloop.entity.input": "Hi",
        },
    )

    handler.start_llm(invocation)
    # Simulate model output
    invocation.output_messages = [
        OutputMessage(
            role="assistant", parts=[Text("Hi there!")], finish_reason="stop"
        )
    ]
    handler.stop_llm(invocation)

    print(
        "\nInvocation complete. Check exporter output above for:"
        "\n * SemanticConvention span containing promoted gen_ai.* keys"
        "\n * Traceloop compat span (legacy format)"
        "\nIf the translator emitter is enabled, attributes like gen_ai.agent.name should be present.\n"
    )


if __name__ == "__main__":
    run_example()
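
To try the script, select the emitters via environment variables before running it; the filename below is a placeholder, since the diff does not show the script's path:

```bash
export OTEL_INSTRUMENTATION_GENAI_EMITTERS=span,traceloop_compat
export OTEL_GENAI_ENABLE_TRACELOOP_TRANSLATOR=1
python traceloop_example.py  # placeholder filename
```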
@@ -1,6 +1,7 @@
from __future__ import annotations

import logging
import os
from dataclasses import dataclass
from types import MethodType
from typing import Any, Dict, Iterable, List, Sequence
@@ -99,6 +100,25 @@ def _register(spec: EmitterSpec) -> None:
    target.append(spec)
    spec_registry[spec.name] = spec

if os.getenv("OTEL_GENAI_ENABLE_TRACELOOP_TRANSLATOR"):
    try:
        from .traceloop_translator import (
            TraceloopTranslatorEmitter,  # type: ignore
        )

        _register(
            EmitterSpec(
                name="TraceloopTranslator",
                category=_CATEGORY_SPAN,
                factory=lambda ctx: TraceloopTranslatorEmitter(),
                mode="prepend",  # ensure it runs before the semantic span emitter
            )
        )
    except Exception:  # pragma: no cover - defensive
        _logger.exception(
            "Failed to initialize TraceloopTranslator emitter despite flag set"
        )

if settings.enable_span and not settings.only_traceloop_compat:
    _register(
        EmitterSpec(
@@ -254,10 +254,16 @@ def on_start(
        elif isinstance(invocation, EmbeddingInvocation):
            self._start_embedding(invocation)
        else:
            # Use override if processor supplied one; else operation+model
            override = getattr(invocation, "attributes", {}).get(
                "gen_ai.override.span_name"
            )
            if override:
                span_name = str(override)
            else:
                # Use operation field for span name (defaults to "chat")
                operation = getattr(invocation, "operation", "chat")
                model_name = invocation.request_model
                span_name = f"{operation} {model_name}"
            cm = self._tracer.start_as_current_span(
                span_name, kind=SpanKind.CLIENT, end_on_exit=False
            )
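
For context, a hedged sketch of how a caller could supply the `gen_ai.override.span_name` attribute consumed above; the attribute key comes from this diff, while the surrounding invocation setup is illustrative:

```python
from opentelemetry.util.genai.types import LLMInvocation

# The span emitter will use this name verbatim instead of "chat gpt-4".
invocation = LLMInvocation(
    request_model="gpt-4",
    attributes={"gen_ai.override.span_name": "llm.openai.chat"},
)
```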