|
| 1 | +--- |
| 2 | +title: LLM Observability Metrics |
| 3 | +description: 'Learn about useful metrics you can generate from LLM Observability data.' |
| 4 | +further_reading: |
| 5 | + - link: 'llm_observability/' |
| 6 | + tag: "Documentation" |
| 7 | + text: 'Learn more about LLM Observability' |
| 8 | + - link: 'monitors/' |
| 9 | + tag: "Documentation" |
| 10 | + text: 'Create and manage monitors to notify your teams when it matters.' |
| 11 | +--- |
| 12 | + |
| 13 | +After you instrument your application with LLM Observability, you can use LLM Observability metrics in dashboards and monitors. These metrics capture span counts, error counts, token usage, and latency for your LLM applications, and they are calculated from 100% of the application's traffic. |
| 14 | + |
| 15 | +<div class="alert alert-info">Only the tags listed in the tables below are available on LLM Observability metrics. Other tags set on spans are not carried over to these metrics.</div> |
| 16 | + |
| 17 | +### Span metrics |
| 18 | + |
| 19 | +| Metric Name | Description | Metric Type | Tags | |
| 20 | +|-------------|-------------|-------------|------| |
| 21 | +| `ml_obs.span` | Total number of spans with a span kind | Count | `env`, `error`, `ml_app`, `model_name`, `model_provider`, `service`, `span_kind`, `version` | |
| 22 | +| `ml_obs.span.duration` | Total duration of spans in seconds | Distribution | `env`, `error`, `ml_app`, `model_name`, `model_provider`, `service`, `span_kind`, `version` | |
| 23 | +| `ml_obs.span.error` | Number of errors that occurred in the span | Count | `env`, `error`, `ml_app`, `model_name`, `model_provider`, `service`, `span_kind`, `version` | |
| 24 | + |
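As an illustration of how the span metrics above might be combined, the following sketch builds a query string for the fraction of spans that reported errors. The function name and the `chatbot` and `prod` values are hypothetical, and the query shape (a `sum` aggregation with `.as_count()` on each side of a division) is an assumption about how you would write this in a dashboard formula or monitor, not a query taken from this page.

```python
def span_error_rate_query(ml_app: str, env: str) -> str:
    """Build a metric query for the share of spans that reported errors.

    Hypothetical helper: it only formats a string and does not call any API.
    """
    # Scope both metrics to the same application and environment.
    # ml_app and env are tags documented for the span metrics.
    scope = f"{{ml_app:{ml_app},env:{env}}}"
    errors = f"sum:ml_obs.span.error{scope}.as_count()"
    spans = f"sum:ml_obs.span{scope}.as_count()"
    # Dividing error count by span count gives an error rate.
    return f"{errors} / {spans}"


if __name__ == "__main__":
    print(span_error_rate_query("chatbot", "prod"))
```

Scoping both sides of the division to the same tags keeps the numerator and denominator comparable.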
| 25 | +### LLM token metrics |
| 26 | + |
| 27 | +| Metric Name | Description | Metric Type | Tags | |
| 28 | +|-------------|-------------|-------------|------| |
| 29 | +| `ml_obs.span.llm.input.tokens` | Number of tokens in the input sent to the LLM | Distribution | `env`, `error`, `ml_app`, `model_name`, `model_provider`, `service`, `version` | |
| 30 | +| `ml_obs.span.llm.output.tokens` | Number of tokens in the output | Distribution | `env`, `error`, `ml_app`, `model_name`, `model_provider`, `service`, `version` | |
| 31 | +| `ml_obs.span.llm.prompt.tokens` | Number of tokens used in the prompt | Distribution | `env`, `error`, `ml_app`, `model_name`, `model_provider`, `service`, `version` | |
| 32 | +| `ml_obs.span.llm.completion.tokens` | Tokens generated as a completion during the span | Distribution | `env`, `error`, `ml_app`, `model_name`, `model_provider`, `service`, `version` | |
| 33 | +| `ml_obs.span.llm.total.tokens` | Total tokens consumed during the span (input + output + prompt) | Distribution | `env`, `error`, `ml_app`, `model_name`, `model_provider`, `service`, `version` | |
| 34 | +| `ml_obs.span.llm.input.characters` | Number of characters in the input sent to the LLM | Distribution | `env`, `error`, `ml_app`, `model_name`, `model_provider`, `service`, `version` | |
| 35 | +| `ml_obs.span.llm.output.characters` | Number of characters in the output | Distribution | `env`, `error`, `ml_app`, `model_name`, `model_provider`, `service`, `version` | |
| 36 | + |
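Because only the listed tags are available on these metrics, it can help to check group-by tags before building a query. The sketch below is a hypothetical helper, not part of any Datadog library: it validates tags against the set documented for the LLM token metrics (which does not include `span_kind`) and formats a grouped query string. The `sum` aggregation and the `chatbot` value are assumptions for illustration.

```python
# Tags documented for the LLM token metrics on this page.
TOKEN_METRIC_TAGS = {
    "env", "error", "ml_app", "model_name",
    "model_provider", "service", "version",
}


def token_usage_query(metric: str, group_by: list[str], ml_app: str) -> str:
    """Format a token-usage query grouped by documented tags only."""
    # Reject tags that are not available on token metrics, such as span_kind.
    unknown = sorted(set(group_by) - TOKEN_METRIC_TAGS)
    if unknown:
        raise ValueError(f"tags not available on token metrics: {unknown}")
    groups = ",".join(group_by)
    return f"sum:{metric}{{ml_app:{ml_app}}} by {{{groups}}}"


if __name__ == "__main__":
    print(token_usage_query("ml_obs.span.llm.total.tokens", ["model_name"], "chatbot"))
```

Grouping by `model_name` or `model_provider` is one way to compare token consumption across models within the same application.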
| 37 | +### Embedding metrics |
| 38 | + |
| 39 | +| Metric Name | Description | Metric Type | Tags | |
| 40 | +|-------------|-------------|-------------|------| |
| 41 | +| `ml_obs.span.embedding.input.tokens` | Number of input tokens used for generating an embedding | Distribution | `env`, `error`, `ml_app`, `model_name`, `model_provider`, `service`, `version` | |
| 42 | + |
| 43 | +### Trace metrics |
| 44 | + |
| 45 | +| Metric Name | Description | Metric Type | Tags | |
| 46 | +|-------------|-------------|-------------|------| |
| 47 | +| `ml_obs.trace` | Number of traces | Count | `env`, `error`, `ml_app`, `service`, `span_kind`, `version` | |
| 48 | +| `ml_obs.trace.duration` | Duration of each trace, measured across all of its spans | Distribution | `env`, `error`, `ml_app`, `service`, `span_kind`, `version` | |
| 49 | +| `ml_obs.trace.error` | Number of errors that occurred during the trace | Count | `env`, `error`, `ml_app`, `service`, `span_kind`, `version` | |
| 50 | + |
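The trace metrics can also back a latency alert. The following is a minimal sketch of a monitor query string on `ml_obs.trace.duration`. The helper name, the `last_5m` window, and the threshold are hypothetical, and the sketch assumes trace duration is reported in seconds like `ml_obs.span.duration`, which this page does not state.

```python
def trace_latency_monitor_query(
    ml_app: str, threshold: float, window: str = "last_5m"
) -> str:
    """Format a metric monitor query on average trace duration.

    Hypothetical helper: the threshold unit is assumed to be seconds.
    """
    # Evaluate the average trace duration over the window for one application.
    return f"avg({window}):avg:ml_obs.trace.duration{{ml_app:{ml_app}}} > {threshold}"


if __name__ == "__main__":
    print(trace_latency_monitor_query("chatbot", 10))
```

Choose the threshold and evaluation window from your application's observed latency rather than from the placeholder values shown here.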
| 51 | +### Estimated usage metrics |
| 52 | + |
| 53 | +| Metric Name | Description | Metric Type | Tags | |
| 54 | +|-------------|-------------|-------------|------| |
| 55 | +| `ml_obs.estimated_usage.llm.input.tokens` | Estimated number of input tokens used | Distribution | `evaluation_name`, `ml_app`, `model_name`, `model_provider`, `model_server` | |
| 56 | + |
| 57 | +### Deprecated metrics |
| 58 | + |
| 59 | +<div class="alert alert-warning"> |
| 60 | +The following metrics are deprecated and maintained only for backward compatibility. Datadog strongly recommends using the non-deprecated token metrics for all token usage measurements. |
| 61 | +</div> |
| 62 | + |
| 63 | +| Metric Name | Description | Metric Type | Tags | |
| 64 | +|-------------|-------------|-------------|------| |
| 65 | +| `ml_obs.estimated_usage.llm.output.tokens` | Estimated number of output tokens generated | Distribution | `evaluation_name`, `ml_app`, `model_name`, `model_provider`, `model_server` | |
| 66 | +| `ml_obs.estimated_usage.llm.total.tokens` | Total estimated tokens (input + output) used | Distribution | `evaluation_name`, `ml_app`, `model_name`, `model_provider`, `model_server` | |
| 67 | + |
| 68 | +## Next steps |
| 69 | + |
| 70 | +{{< whatsnext desc="Make use of your LLM Observability metrics:" >}} |
| 71 | + {{< nextlink href="dashboards/" >}}Create a dashboard to track and correlate LLM Observability metrics{{< /nextlink >}} |
| 72 | + {{< nextlink href="monitors/create/" >}}Create a monitor for alerts and notifications{{< /nextlink >}} |
| 73 | +{{< /whatsnext >}} |
| 74 | + |
| 75 | + |
| 76 | +## Further Reading |
| 77 | + |
| 78 | +{{< partial name="whats-next/whats-next.html" >}} |