Merged
Changes from 4 commits
32 commits
cdb0938
Update events.yaml
singankit Jul 24, 2025
01b9ac4
Update registry.yaml
singankit Jul 24, 2025
1f8e684
Update gen-ai-events.md
singankit Jul 24, 2025
485a76e
Update gen-ai.md
singankit Jul 24, 2025
53f120d
Adding gen_ai.evaluation.ouptut.metadata attribute
singankit Jul 25, 2025
a04553f
Updating metadata attribute
singankit Jul 28, 2025
5338dee
Updating md files
singankit Jul 29, 2025
9ecd14e
Adding evaluation event header in docs
singankit Jul 29, 2025
f78bdd3
Span to capture evaluation result instead of events
singankit Aug 1, 2025
90e4b08
Updating changelog
singankit Aug 5, 2025
cdc4b0a
Fixing yamllint issues
singankit Aug 6, 2025
f334fee
Review comments updates
singankit Aug 6, 2025
7fc2e36
Evaluation result as event
singankit Aug 13, 2025
ffada83
Updating changelog
singankit Aug 13, 2025
633e801
Review comments feedback
singankit Aug 19, 2025
bc827e7
Updating docs
singankit Aug 19, 2025
f2fdb68
Updating event description
singankit Aug 19, 2025
8289d09
Updating evaluation event description
singankit Aug 19, 2025
8716e21
Updating description in md file
singankit Aug 19, 2025
8ec9658
Merge remote-tracking branch 'origin/main' into users/singankit/gen_a…
singankit Aug 20, 2025
9801b06
Updating docs and runnign checks
singankit Aug 20, 2025
48b8c9e
Regenerating docs
singankit Aug 20, 2025
2c6dd4e
Review comments reasoning to explanation
singankit Aug 20, 2025
b81da2c
Updating recommendation level for score.value and score.label
singankit Aug 20, 2025
bd8f4ee
Review comment and yamllint fix
singankit Aug 20, 2025
aaa6367
Doc review comments
singankit Aug 21, 2025
ac0a67e
Removing token usage attribute from evaluation result
singankit Aug 26, 2025
2a5f787
Merge main
singankit Aug 26, 2025
890b9aa
Rebase from main and updating md files
singankit Aug 26, 2025
ac667ef
Reviw comments for response_id attribute on evaluation result
singankit Aug 26, 2025
5096a55
Update event docs
singankit Aug 26, 2025
f6c4408
Updating doc content
singankit Aug 26, 2025
118 changes: 118 additions & 0 deletions docs/gen-ai/gen-ai-events.md
@@ -519,6 +519,124 @@ Semantic conventions for individual systems MAY specify a different type for arg
<!-- END AUTOGENERATED TEXT -->
<!-- endsemconv -->

<!-- semconv event.gen_ai.evaluation.result -->
<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
<!-- see templates/registry/markdown/snippet.md.j2 -->
<!-- prettier-ignore-start -->
<!-- markdownlint-capture -->
<!-- markdownlint-disable -->

**Status:** ![Development](https://img.shields.io/badge/-development-blue)

## Event: `gen_ai.evaluation.result`

The event name MUST be `gen_ai.evaluation.result`.

This event describes a generic GenAI response evaluation result.

| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability |
|---|---|---|---|---|---|
| [`gen_ai.evaluation.name`](/docs/registry/attributes/gen-ai.md) | string | The qualified name of the evaluation used to evaluate the GenAI response. | `Relevance`; `IntentResolution` | `Required` | ![Development](https://img.shields.io/badge/-development-blue) |
| [`error.type`](/docs/registry/attributes/error.md) | string | Describes a class of error the operation ended with. [1] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` if and only if evaluation failed | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
| [`gen_ai.evaluation.score`](/docs/registry/attributes/gen-ai.md) | double | The score calculated by the evaluator for the GenAI response. | `4.0` | `Conditionally Required` if evaluation completed successfully | ![Development](https://img.shields.io/badge/-development-blue) |
| [`gen_ai.evaluation.input.metadata`](/docs/registry/attributes/gen-ai.md) | string | Metadata associated with the evaluation input. [2] | `{\"requestId\": \"fab3ee5d-a3c6-4c47-b3de-901bf02fa045\"}` | `Recommended` | ![Development](https://img.shields.io/badge/-development-blue) |
| [`gen_ai.evaluation.metadata`](/docs/registry/attributes/gen-ai.md) | string | Additional metadata associated with the evaluation. [3] | `{\"evaluator_version\": \"1.2.0\", \"gen_ai.thread.id\": \"thread_ggguJ0iZXRPjUnCy9vT9Fdvs\"}` | `Recommended` | ![Development](https://img.shields.io/badge/-development-blue) |
| [`gen_ai.evaluation.output.metadata`](/docs/registry/attributes/gen-ai.md) | string | Metadata associated with the evaluation result. [4] | `{\"Perplexity\": 1.335}` | `Recommended` | ![Development](https://img.shields.io/badge/-development-blue) |
| [`gen_ai.evaluation.reasoning`](/docs/registry/attributes/gen-ai.md) | string | A free-form explanation, provided by the evaluator, for the assigned score. | `The response is factually accurate but lacks sufficient detail to fully address the question.` | `Recommended` | ![Development](https://img.shields.io/badge/-development-blue) |
| [`gen_ai.provider.name`](/docs/registry/attributes/gen-ai.md) | string | The Generative AI provider as identified by the client or server instrumentation. [5] | `openai`; `gcp.gen_ai`; `gcp.vertex_ai` | `Recommended` | ![Development](https://img.shields.io/badge/-development-blue) |
| [`gen_ai.usage.input_tokens`](/docs/registry/attributes/gen-ai.md) | int | The number of tokens used in the GenAI input (prompt). [6] | `100` | `Recommended` if evaluation was performed by the model | ![Development](https://img.shields.io/badge/-development-blue) |
| [`gen_ai.usage.output_tokens`](/docs/registry/attributes/gen-ai.md) | int | The number of tokens used in the GenAI response (completion). [7] | `180` | `Recommended` if evaluation was performed by the model | ![Development](https://img.shields.io/badge/-development-blue) |

**[1] `error.type`:** The `error.type` SHOULD be predictable, and SHOULD have low cardinality.

When `error.type` is set to a type (e.g., an exception type), its
canonical class name identifying the type within the artifact SHOULD be used.

Instrumentations SHOULD document the list of errors they report.

The cardinality of `error.type` within one instrumentation library SHOULD be low.
Telemetry consumers that aggregate data from multiple instrumentation libraries and applications
should be prepared for `error.type` to have high cardinality at query time when no
additional filters are applied.

If the operation has completed successfully, instrumentations SHOULD NOT set `error.type`.

If a specific domain defines its own set of error identifiers (such as HTTP or gRPC status codes),
it's RECOMMENDED to:

- Use a domain-specific attribute
- Set `error.type` to capture all errors, regardless of whether they are defined within the domain-specific set or not.

**[2] `gen_ai.evaluation.input.metadata`:** The structure is specific to the evaluator. If the metadata is structured, it is RECOMMENDED to provide it in structured form using a language-specific API. It can also be captured as a JSON string when a structured API is not available. If metadata properties contain sensitive information such as prompts or completions, the corresponding properties MUST NOT be recorded by default. Instrumentations MAY provide a way to override this behavior and record sensitive information in the metadata if the user explicitly allows it.

**[3] `gen_ai.evaluation.metadata`:** The structure is specific to the evaluator. If the metadata is structured, it is RECOMMENDED to provide it in structured form using a language-specific API. It can also be captured as a JSON string when a structured API is not available.

**[4] `gen_ai.evaluation.output.metadata`:** The structure is specific to the evaluator. If the metadata is structured, it is RECOMMENDED to provide it in structured form using a language-specific API. It can also be captured as a JSON string when a structured API is not available. If metadata properties contain sensitive information such as prompts or completions, the corresponding properties MUST NOT be recorded by default. Instrumentations MAY provide a way to override this behavior and record sensitive information in the metadata if the user explicitly allows it.

**[5] `gen_ai.provider.name`:** The attribute SHOULD be set based on the instrumentation's best
knowledge and may differ from the actual model provider.

Multiple providers, including Azure OpenAI, Gemini, and AI hosting platforms
are accessible using the OpenAI REST API and corresponding client libraries,
but may proxy or host models from different providers.

The `gen_ai.request.model`, `gen_ai.response.model`, and `server.address`
attributes may help identify the actual system in use.

The `gen_ai.provider.name` attribute acts as a discriminator that
identifies the GenAI telemetry format flavor specific to that provider
within GenAI semantic conventions.
It SHOULD be set consistently with provider-specific attributes and signals.
For example, GenAI spans, metrics, and events related to AWS Bedrock
should have the `gen_ai.provider.name` set to `aws.bedrock` and include
applicable `aws.bedrock.*` attributes; they are not expected to include
`openai.*` attributes.

**[6] `gen_ai.usage.input_tokens`:** The total number of input tokens used by the model during the evaluation.

**[7] `gen_ai.usage.output_tokens`:** The total number of output tokens used by the model during the evaluation.
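
As an illustration of notes [2] to [4], the following minimal sketch serializes evaluator metadata to a JSON string and drops sensitive properties unless the caller opts in. The helper name `serialize_metadata`, the `capture_sensitive` flag, and the `SENSITIVE_KEYS` set are assumptions made for this example; the convention does not define them, and a real instrumentation may prefer a structured API where one is available.

```python
import json

# Hypothetical set of metadata keys treated as sensitive. This is an
# assumption for illustration; the convention does not enumerate such keys.
SENSITIVE_KEYS = frozenset({"prompt", "completion"})


def serialize_metadata(metadata, capture_sensitive=False,
                       sensitive_keys=SENSITIVE_KEYS):
    """Return metadata as a JSON string.

    Sensitive properties are dropped by default and kept only when the
    caller explicitly sets capture_sensitive=True.
    """
    if not capture_sensitive:
        metadata = {
            key: value
            for key, value in metadata.items()
            if key not in sensitive_keys
        }
    # sort_keys keeps the serialized form stable across runs.
    return json.dumps(metadata, sort_keys=True)


if __name__ == "__main__":
    raw = {"requestId": "fab3ee5d", "prompt": "What is OpenTelemetry?"}
    print(serialize_metadata(raw))
    print(serialize_metadata(raw, capture_sensitive=True))
```

The opt-in flag defaults to `False` so that the default behavior matches the "MUST NOT be recorded by default" requirement.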

---

`error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
|---|---|---|
| `_OTHER` | A fallback error value to be used when the instrumentation doesn't define a custom value. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
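
One way an instrumentation might derive `error.type` for a failed evaluation is sketched below: use the exception's class name when an exception is available, and fall back to `_OTHER` otherwise. The function name `error_type_for` is hypothetical, and using the module-qualified class name as the "canonical class name" is an assumption about how that guidance maps to Python.

```python
# Well-known fallback value from the table above.
OTHER = "_OTHER"


def error_type_for(error):
    """Return a low-cardinality error.type value for a failed evaluation.

    Exceptions map to their class name (module-qualified unless built in);
    anything else maps to the "_OTHER" fallback.
    """
    if isinstance(error, BaseException):
        cls = type(error)
        if cls.__module__ == "builtins":
            return cls.__qualname__
        return f"{cls.__module__}.{cls.__qualname__}"
    return OTHER


if __name__ == "__main__":
    print(error_type_for(TimeoutError("evaluator timed out")))
    print(error_type_for("unexpected failure"))
```

Exception messages are deliberately excluded, since they tend to be high-cardinality.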

---

`gen_ai.provider.name` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
|---|---|---|
| `anthropic` | [Anthropic](https://www.anthropic.com/) | ![Development](https://img.shields.io/badge/-development-blue) |
| `aws.bedrock` | [AWS Bedrock](https://aws.amazon.com/bedrock) | ![Development](https://img.shields.io/badge/-development-blue) |
| `azure.ai.inference` | Azure AI Inference | ![Development](https://img.shields.io/badge/-development-blue) |
| `azure.ai.openai` | [Azure OpenAI](https://azure.microsoft.com/products/ai-services/openai-service/) | ![Development](https://img.shields.io/badge/-development-blue) |
| `cohere` | [Cohere](https://cohere.com/) | ![Development](https://img.shields.io/badge/-development-blue) |
| `deepseek` | [DeepSeek](https://www.deepseek.com/) | ![Development](https://img.shields.io/badge/-development-blue) |
| `gcp.gemini` | [Gemini](https://cloud.google.com/products/gemini) [8] | ![Development](https://img.shields.io/badge/-development-blue) |
| `gcp.gen_ai` | Any Google generative AI endpoint [9] | ![Development](https://img.shields.io/badge/-development-blue) |
| `gcp.vertex_ai` | [Vertex AI](https://cloud.google.com/vertex-ai) [10] | ![Development](https://img.shields.io/badge/-development-blue) |
| `groq` | [Groq](https://groq.com/) | ![Development](https://img.shields.io/badge/-development-blue) |
| `ibm.watsonx.ai` | [IBM Watsonx AI](https://www.ibm.com/products/watsonx-ai) | ![Development](https://img.shields.io/badge/-development-blue) |
| `mistral_ai` | [Mistral AI](https://mistral.ai/) | ![Development](https://img.shields.io/badge/-development-blue) |
| `openai` | [OpenAI](https://openai.com/) | ![Development](https://img.shields.io/badge/-development-blue) |
| `perplexity` | [Perplexity](https://www.perplexity.ai/) | ![Development](https://img.shields.io/badge/-development-blue) |
| `x_ai` | [xAI](https://x.ai/) | ![Development](https://img.shields.io/badge/-development-blue) |

**[8]:** Used when accessing the 'generativelanguage.googleapis.com' endpoint. Also known as the AI Studio API.

**[9]:** May be used when the specific backend is unknown.

**[10]:** Used when accessing the 'aiplatform.googleapis.com' endpoint.
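
Notes [8] to [10] can be sketched as a small lookup from endpoint host to provider name. The function name `gcp_provider_name` is hypothetical, and treating region-prefixed hosts (for example `us-central1-aiplatform.googleapis.com`) as Vertex AI is an assumption of this example, not something the notes state. The function should only be called when the endpoint is already known to be a Google generative AI endpoint.

```python
from urllib.parse import urlparse

# Endpoint hosts named in notes [8] and [10], mapped to well-known values.
_GCP_HOST_TO_PROVIDER = {
    "generativelanguage.googleapis.com": "gcp.gemini",
    "aiplatform.googleapis.com": "gcp.vertex_ai",
}


def gcp_provider_name(endpoint_url):
    """Pick a gen_ai.provider.name value for a Google generative AI endpoint."""
    host = urlparse(endpoint_url).hostname or ""
    for known_host, provider in _GCP_HOST_TO_PROVIDER.items():
        # Assumption: region-prefixed hosts such as
        # "us-central1-aiplatform.googleapis.com" count as the same endpoint.
        if host == known_host or host.endswith("-" + known_host):
            return provider
    # Note [9]: the generic value may be used when the backend is unknown.
    return "gcp.gen_ai"


if __name__ == "__main__":
    print(gcp_provider_name("https://generativelanguage.googleapis.com/v1beta/models"))
    print(gcp_provider_name("https://us-central1-aiplatform.googleapis.com/v1/projects/p"))
```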

<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- END AUTOGENERATED TEXT -->
<!-- endsemconv -->
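
The requirement levels in the `gen_ai.evaluation.result` table can be illustrated with a plain-Python sketch that assembles the event attributes and enforces the conditional requirements. It deliberately avoids any OpenTelemetry SDK call; the function name `build_evaluation_result_attributes` and its parameters are hypothetical, and how the resulting dictionary is handed to an event or log API is left to the instrumentation.

```python
EVENT_NAME = "gen_ai.evaluation.result"


def build_evaluation_result_attributes(name, score=None, error_type=None,
                                       reasoning=None, provider_name=None):
    """Build the attribute dictionary for a gen_ai.evaluation.result event.

    A missing error_type is interpreted as a successful evaluation, in
    which case the score is required.
    """
    if not name:
        raise ValueError("gen_ai.evaluation.name is required")
    if error_type is None and score is None:
        raise ValueError(
            "gen_ai.evaluation.score is required when the evaluation succeeded"
        )

    attributes = {"gen_ai.evaluation.name": name}
    if error_type is not None:
        # Set only when the evaluation failed.
        attributes["error.type"] = error_type
    if score is not None:
        # The attribute type is double, so integers are converted.
        attributes["gen_ai.evaluation.score"] = float(score)
    # Recommended attributes are included only when provided.
    if reasoning is not None:
        attributes["gen_ai.evaluation.reasoning"] = reasoning
    if provider_name is not None:
        attributes["gen_ai.provider.name"] = provider_name
    return attributes


if __name__ == "__main__":
    print(EVENT_NAME)
    print(build_evaluation_result_attributes("Relevance", score=4))
    print(build_evaluation_result_attributes("Relevance", error_type="timeout"))
```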

## Custom events

System-specific events that are not covered in this document SHOULD be documented in corresponding Semantic Conventions extensions and