14 changes: 5 additions & 9 deletions docs/sources/tempo/metrics-from-traces/_index.md
@@ -16,11 +16,11 @@ Refer to the table for a summary of these metrics and their capabilities.

| | Metrics-generator | TraceQL metrics |
| -------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Functionality | An optional component within Tempo that processes incoming spans to produce predefined metrics, specifically focusing on RED (Rate, Error, Duration) metrics and service graphs. | An experimental feature in Tempo that allows for on-the-fly computation of metrics directly from trace data using the TraceQL query language, without the need for a separate metrics storage backend. |
| Functionality | An optional component within Tempo that processes incoming spans to produce predefined metrics, specifically focusing on RED (Rate, Error, Duration) metrics and service graphs. | A feature in Tempo that allows for on-the-fly computation of metrics directly from trace data using the TraceQL query language, without the need for a separate metrics storage backend. |
| Capabilities | **Span metrics:** Calculates the total count and duration of spans based on dimensions like service name, operation, span kind, status code, and other span attributes. <br> **Service graphs**: Analyzes traces to map relationships between services, identifying transactions and recording metrics related to request counts and durations. | Ad-hoc aggregation and analysis of trace data by applying functions to trace query results, similar to how LogQL operates with logs. |
| Output | The generated metrics are written to a Prometheus-compatible database, enabling integration with time-series databases for storage and analysis. | Generates metrics dynamically at query time, facilitating flexible and detailed investigations into specific behaviors or patterns within the trace data. |
| Use case | Ideal for continuous monitoring and alerting, leveraging predefined metrics that are stored in a time-series database. Less expressive for trace-specific analysis as it focuses on standard telemetry dimensions and RED metrics. | More expressive and flexible for analyzing trace data directly, enabling complex trace-based queries and fine-grained exploration. Suited for exploratory analysis and debugging, allowing users to derive insights from trace data without prior metric definitions or storage considerations. |
| Setup | Configure the metrics-generator in the Tempo configuration file, enable processors like span metrics or service graphs, and send metrics to a Prometheus-compatible database. | Configure the local-blocks processor in overrides and in the metrics-generator configurations. |
| Setup | Configure the metrics-generator in the Tempo configuration file, enable processors like span metrics or service graphs, and send metrics to a Prometheus-compatible database. | TraceQL metrics work out of the box. Configure a Tempo data source in Grafana. |
| Query range | Supports querying over long time ranges, limited only by retention of the metrics backend. | Limited to a maximum query range of 3 hours by default (as of now), as metrics are computed from stored traces in real time. |
| Query language | Metrics are consumed using PromQL via Prometheus/Grafana. | Uses TraceQL which has a PromQL-inspired syntax, but not all PromQL features are supported; it's a similar but distinct subset with different semantics. |

@@ -38,10 +38,8 @@ By using the labels on those metrics, you can get a more granular view of reques
| Overall issues in your tracing ecosystem | Error | Number of those requests that are failing |
| Response times and latency issues | Duration | Amount of time those requests take, represented as a histogram |

The metrics-generator generates metrics from tracing data using the `services_graphs`, `span_metrics`, and `local_blocks` processors.
The `service_graphs` and `span_metrics` processors generate metrics that are written to a Prometheus-compatible backend.
The `local_blocks` processor adds support for TraceQL metrics and provides the capability of answering TraceQL metric queries to the generators without writing any data to the Prometheus backend.
The metrics-generator processes spans and write metrics using the Prometheus remote write protocol.
The metrics-generator generates metrics from tracing data using the `service_graphs` and `span_metrics` processors, which are written to a Prometheus-compatible backend.
The metrics-generator processes spans and writes metrics using the Prometheus remote write protocol.

For more information, refer to [Metrics generator](https://grafana.com/docs/tempo/<TEMPO_VERSION>/metrics-from-traces/metrics-generator/).

@@ -66,7 +64,7 @@ The metrics-generator automatically generates exemplars as well which allows eas

{{< figure src="/media/docs/grafana/exemplars/screenshot-exemplar-span-details.png" class="docs-image--no-shadow" max-width= "600px" caption="Span details" >}}

## TraceQL metrics (public preview)
## TraceQL metrics

Traces are a unique observability signal that contain causal relationships between the components in your system.

@@ -81,8 +79,6 @@ TraceQL metrics queries allows you to calculate metrics on trace span data on-th

TraceQL metrics, powered by the API of the same name, return Prometheus-like time series for a given metrics query.
Metrics queries apply a function to trace query results.
TraceQL metrics uses the `local_blocks` processor in metrics-generator.

TraceQL metrics power the [Grafana Traces Drilldown app](https://grafana.com/docs/grafana/<GRAFANA_VERSION>/explore/simplified-exploration/traces/).
You can explore the power of visualizing your metrics in the Grafana Traces Drilldown app using Grafana Play.
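
As a rough illustration of the API mentioned above, the sketch below builds a range-query URL in Python. The endpoint path `/api/metrics/query_range`, the `q`, `start`, `end`, and `step` parameters, the base URL, and the helper name `build_query_range_url` are assumptions for illustration and are not part of this diff; verify them against the Tempo version you run. The 3-hour check mirrors the default maximum query range listed in the comparison table.

```python
from urllib.parse import urlencode

# Default maximum range for TraceQL metrics queries, per the comparison table.
MAX_RANGE_SECONDS = 3 * 60 * 60


def build_query_range_url(base_url: str, query: str, start: int, end: int, step: str = "1m") -> str:
    """Build a URL for a TraceQL metrics range query.

    Assumption: the path and the q/start/end/step parameters match the Tempo
    HTTP API; start and end are Unix timestamps in seconds.
    """
    if end <= start:
        raise ValueError("end must be after start")
    if end - start > MAX_RANGE_SECONDS:
        raise ValueError("time range exceeds the default 3-hour maximum")
    params = urlencode({"q": query, "start": start, "end": end, "step": step})
    return f"{base_url.rstrip('/')}/api/metrics/query_range?{params}"
```

For example, `build_query_range_url("http://localhost:3200", "{ } | rate()", start, end)` produces a URL that an HTTP client could request; the host and port here are placeholders.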

@@ -10,20 +10,18 @@ weight: 300
# Metrics-generator

Metrics-generator is an optional Tempo component that derives metrics from ingested traces.
If present, the distributor writes received spans to both the ingester and the metrics-generator.
The metrics-generator processes spans and writes metrics to a Prometheus data source using the Prometheus remote write protocol.
The metrics-generator consumes trace data from Kafka and writes metrics to a Prometheus data source using the Prometheus remote-write protocol.

## Architecture

Metrics-generator leverages the data available in the ingest path in Tempo to provide additional value by generating metrics from traces.
Metrics-generator consumes trace data from Kafka to generate metrics from traces.

The metrics-generator internally runs a set of **processors**.
Each processor ingests spans and produces metrics.
Every processor derives different metrics. Currently, the following processors are available:

- Service graphs
- Span metrics
- Local blocks

<p align="center"><img src="tempo-metrics-gen-overview.svg" alt="Service metrics architecture"></p>
Contributor

Do we need to update the diagram?

Contributor Author

It's very high level, nothing changes.


@@ -47,12 +45,6 @@ The more dimensions are enabled, the higher the cardinality of the generated met

To learn more about this processor, refer to the [span metrics](/docs/tempo/<TEMPO_VERSION>/metrics-from-traces/span-metrics/) documentation.

### Local blocks

The local blocks processor stores spans for a set period of time and
enables more complex APIs to perform calculations on the data. The processor must be
enabled for certain metrics APIs to function.

## Remote writing metrics

The metrics-generator runs a Prometheus Agent that periodically sends metrics to a `remote_write` endpoint.
@@ -25,8 +25,6 @@ refs:

<!-- Using a custom admonition because no feature flag is required. -->

{{< docs/shared source="tempo" lookup="traceql-metrics-admonition.md" version="<TEMPO_VERSION>" >}}
Contributor

Oh! Remove this because TraceQL metrics goes GA? If so, we should search through all of the docs for this line and make sure it's deleted. We'll have to do the same for cloud docs when the time is right.

Contributor

Looks like you did that. We'll just have to do clean up in the Cloud docs at the appropriate time.

Contributor

Once we have the docs/shared line removed from the Cloud docs, we can delete the shared file (docs/sources/tempo/shared/traceql-metrics-admonition.md).

Contributor Author

Yup!


TraceQL metrics is a feature in Grafana Tempo that creates metrics from traces.

Metric queries extend trace queries by applying a function to trace query results.
@@ -13,94 +13,17 @@ keywords:

# Configure TraceQL metrics

{{< docs/shared source="tempo" lookup="traceql-metrics-admonition.md" version="<TEMPO_VERSION>" >}}

TraceQL language provides metrics queries as an experimental feature.
TraceQL language provides metrics queries as a feature.
Metric queries extend trace queries by applying a function to trace query results.
This powerful feature creates metrics from traces, much in the same way that LogQL metric queries create metrics from logs.

## Before you begin

To use the metrics generated from traces, you need to:

- Set the `local-blocks` processor to active in your `metrics-generator` configuration
- Configure a Tempo data source in Grafana or Grafana Cloud ([documentation](/docs/grafana/<GRAFANA_VERSION>/datasources/tempo/configure-tempo-data-source/))
- Access Grafana Cloud or Grafana version 10.4 or later

Refer to the [Metrics-generator configuration](http://grafana.com/docs/tempo/<TEMPO_VERSION>/configuration/#metrics-generator) documentation for more information about the `metrics-generator` configuration.
Contributor

Does this mean we won't have to have the local blocks processor set up anywhere for TraceQL metrics? If so, we'll need to check through all of the docs for any place local blocks is mentioned.

Contributor

Again, this is one of the things we'll have to check in Grafana Cloud Traces docs.

Contributor Author

Yes, local-blocks is completely gone. Poof!
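
The docs sweep discussed in this thread could be sketched roughly as follows. This is a minimal Python sketch, not tooling from the Tempo repository: the function name `find_stale_references`, the list of spelling variants, and the choice to scan only `*.md` files are assumptions made for illustration.

```python
from pathlib import Path

# Strings the review thread says should no longer appear in the docs.
# The spelling variants are assumptions about how the processor may be written.
STALE_PATTERNS = (
    "local-blocks",
    "local_blocks",
    "localblocks",
    "local blocks",
    "traceql-metrics-admonition.md",
)


def find_stale_references(docs_root: str, patterns=STALE_PATTERNS):
    """Return (relative path, line number, pattern) hits for Markdown files under docs_root.

    Files are visited in sorted path order; matching is case-insensitive.
    """
    hits = []
    root = Path(docs_root)
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_number, line in enumerate(text.splitlines(), start=1):
            lowered = line.lower()
            for pattern in patterns:
                if pattern in lowered:
                    hits.append((path.relative_to(root).as_posix(), line_number, pattern))
    return hits
```

Running `find_stale_references("docs/sources/tempo")` from a checkout would list candidate lines to review by hand; a hit is only a prompt to look, since some mentions (release notes, for example) may be intentional.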


## Activate and configure the `local-blocks` processor

You must enable the local-blocks processor to start using metrics queries like `{ } | rate()`.
If not enabled, then the metrics queries fail with the error `localblocks processor not found`.
Enabling the `local-blocks` processor can be done either per tenant or in all tenants.

To activate the `local-blocks` processor for all users, add it to the list of processors in the `overrides` block of your Tempo configuration.

```yaml
# Global overrides configuration.
overrides:
metrics_generator_processors: ["local-blocks"]
```

To configure the processor per tenant, use the `metrics_generator_processor` override.

Example for per-tenant in the per-tenant overrides:

```yaml
overrides:
'tenantID':
metrics_generator_processors:
- local-blocks
```

By default, for all tenants in the main configuration:

```yaml
overrides:
defaults:
metrics_generator:
processors: [local-blocks]
```

Add this configuration to run TraceQL metrics queries against all spans and not just server spans:

```yaml
metrics_generator:
processor:
local_blocks:
filter_server_spans: false
```

To run metrics queries on historical data, you must configure the local-blocks processor to flush RF1 blocks to object storage:

```yaml
metrics_generator:
processor:
local_blocks:
flush_to_storage: true
```

Setting `flush_to_storage` to `true` ensures that metrics blocks are flushed to storage so TraceQL metrics queries against historical data.

If you configured Tempo using the `tempo-distributed` Helm chart, you can also set `traces_storage` using your `values.yaml` file.
Refer to the [Helm chart for an example](https://github.com/grafana-community/helm-charts/blob/main/charts/tempo-distributed/values.yaml).

For more information about overrides, refer to [Standard overrides](https://grafana.com/docs/tempo/<TEMPO_VERSION>/configuration/#standard-overrides).

### Local blocks and metrics-generator in Azure blob storage and Helm

{{< admonition type="note" >}}
This configuration only applies if you are using a Helm chart, like `tempo-distributed`, to deploy Tempo.
{{< /admonition >}}

[//]: # "Shared content for localblocks and metrics-generator in Azure blob storage when using Helm"
[//]: # "This content is located in /tempo/docs/sources/shared/azure-metrics-generator.md"

{{< docs/shared source="tempo" lookup="azure-metrics-generator.md" version="<TEMPO_VERSION>" >}}

For more information, refer to [Azure hosted storage](https://grafana.com/docs/tempo/<TEMPO_VERSION>/configuration/hosted-storage/azure/).

## Evaluate query timeouts

Because of their expensive nature, these queries can take a long time to run.
@@ -12,10 +12,6 @@ keywords:

# TraceQL metrics functions

<!-- Using a custom admonition because no feature flag is required. -->

{{< docs/shared source="tempo" lookup="traceql-metrics-admonition.md" version="<TEMPO_VERSION>" >}}

<!-- If you add a new function to this page, make sure you also add it to the _index.md#functions section.-->

TraceQL metrics query functions are aggregate operators that can be appended to any TraceQL span selector to compute time-series metrics directly from trace data.
@@ -12,8 +12,6 @@ keywords:

# TraceQL metrics sampling

{{< docs/shared source="tempo" lookup="traceql-metrics-admonition.md" version="<TEMPO_VERSION>" >}}

TraceQL metrics sampling dynamically and automatically chooses how to sample your tracing data to give you the highest quality signal with examining as little data as possible.
The overall performance improvement depends on the query. Heavy queries, such as `{ } | rate()`, show improvements of 2-4 times.

@@ -53,8 +51,7 @@ This behavior can be overridden to focus more on fixed span sampling using `with

TraceQL metrics sampling requires:

- Tempo 2.8+ with TraceQL metrics enabled
- `local-blocks` processor configured in metrics-generator ([documentation](/docs/tempo/<TEMPO_VERSION>/metrics-from-traces/metrics-queries/configure-traceql-metrics/))
- Tempo 3.0 or later
- Grafana 10.4+ or Grafana Cloud for UI integration
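
The version requirements above could be checked in a script along these lines. This is a small Python sketch under stated assumptions: the helper names `parse_version` and `meets_minimum` are hypothetical, version strings are assumed to start with dotted numbers (optionally prefixed with `v`), and pre-release suffixes such as `-rc1` are ignored rather than ordered.

```python
import re


def parse_version(version: str) -> tuple:
    """Extract the leading numeric components, for example 'v3.0.1-rc1' -> (3, 0, 1)."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version: str, minimum: str) -> bool:
    """Return True when version is at least minimum, comparing numeric components only."""
    found, required = parse_version(version), parse_version(minimum)
    width = max(len(found), len(required))
    # Pad the shorter tuple with zeros so that '3.0' and '3.0.0' compare equal.
    found += (0,) * (width - len(found))
    required += (0,) * (width - len(required))
    return found >= required
```

With this sketch, `meets_minimum(tempo_version, "3.0")` and `meets_minimum(grafana_version, "10.4")` express the two requirements; how the version strings are obtained is left to the caller.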

You can use the TraceQL query editor in the Tempo data source in Grafana or Grafana Cloud to run the sample queries.