docs/guides/web-ui/dashboards.md
### Usage Overview
This dashboard is recommended for all users to [manage their costs](../../logfire-costs.md#standard-usage-dashboard).
It breaks down your data by [environment](../../reference/sql.md#deployment_environment), [service](../../reference/sql.md#service_name), [scope](../../reference/sql.md#otel_scope_name) (i.e. instrumentation), and [`span_name`](../../reference/sql.md#span_name)/`metric_name` for `records`/`metrics` respectively.
This lets you see which services and operations are generating the most data.
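To make the breakdown concrete, it is essentially a grouping query over those columns. This is a minimal sketch only: the real dashboard queries Logfire's `records` table with its own SQL engine, while the snippet below uses an in-memory SQLite table with invented rows.

```python
import sqlite3

# Toy stand-in for the `records` table, with invented rows; the real
# dashboard runs against Logfire's SQL engine, not SQLite.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE records ("
    "deployment_environment TEXT, service_name TEXT, "
    "otel_scope_name TEXT, span_name TEXT)"
)
con.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?)",
    [
        ("prod", "api", "logfire.fastapi", "GET /users"),
        ("prod", "api", "logfire.fastapi", "GET /users"),
        ("prod", "worker", "logfire", "process job"),
    ],
)

# Count records per environment/service/scope/span_name group,
# mirroring the dashboard's breakdown.
rows = con.execute(
    """
    SELECT deployment_environment, service_name, otel_scope_name,
           span_name, count(*) AS n
    FROM records
    GROUP BY 1, 2, 3, 4
    ORDER BY n DESC
    """
).fetchall()
```

The group with the highest count comes first, which is the "which services and operations generate the most data" view described above.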
### Exceptions
This dashboard is recommended for all users, especially for monitoring Python applications. It shows the most common exceptions grouped by [service](../../reference/sql.md#service_name), [scope](../../reference/sql.md#otel_scope_name) (i.e. instrumentation), [`span_name`](../../reference/sql.md#span_name), and [`exception_type`](../../reference/sql.md#exception_type). You can also filter by any of these four columns in the variable fields at the top.
Within each row you can also see the most common [`message`](../../reference/sql.md#message) and [`exception_message`](../../reference/sql.md#exception_message) values. These are more variable (higher cardinality), which is why they don't each produce a new row. If there are multiple different values, each is shown on a separate line with a count in brackets at the start. **Double-click on a cell to see all the values within.** Note that `message` is often just the same as `span_name`.
Exceptions are usually errors, but not always. Some exceptions are special-cased and set the [`level`](../../reference/sql.md#level) to `warn`. By default, the dashboard is filtered to `level >= 'error'`; set the 'Errors only' dropdown to 'No' to see all exceptions.
Finally, scroll all the way to the right to see the 'SQL filter to copy to Live View' column to investigate the details of any group.
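The grouping and severity filter can be sketched in the same toy-SQLite style (invented rows and an invented numeric `level` threshold standing in for the real `level >= 'error'` filter):

```python
import sqlite3

# Toy stand-in for the `records` table; the numeric `level` values and
# the threshold below are invented for illustration (higher = more severe).
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE records (service_name TEXT, otel_scope_name TEXT, "
    "span_name TEXT, exception_type TEXT, level INTEGER)"
)
con.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?, ?)",
    [
        ("api", "logfire.fastapi", "GET /users", "ValueError", 17),
        ("api", "logfire.fastapi", "GET /users", "ValueError", 17),
        # A special-cased exception recorded at warn level, filtered out
        # while 'Errors only' is set to 'Yes':
        ("api", "logfire.fastapi", "GET /items", "KeyError", 13),
    ],
)

ERROR_LEVEL = 17  # illustrative stand-in for `level >= 'error'`
rows = con.execute(
    """
    SELECT service_name, otel_scope_name, span_name, exception_type,
           count(*) AS n
    FROM records
    WHERE exception_type IS NOT NULL AND level >= ?
    GROUP BY 1, 2, 3, 4
    ORDER BY n DESC
    """,
    (ERROR_LEVEL,),
).fetchall()
```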
### Web Server Metrics
This dashboard gives an overview of how long each of your web server endpoints takes to respond to requests and how often they succeed and fail. It relies on the standard OpenTelemetry `http.server.duration`/`http.server.request.duration` metric, which is collected by many instrumentation libraries, including those for FastAPI, Flask, Django, ASGI, and WSGI. The charts give a breakdown by endpoint (and sometimes status code), both overall and over time. Hover over each time series to see the most impactful endpoint at the top of the tooltip. The charts show:
- **Total duration:** Endpoints which need to either be optimized or called less often.
- **Average duration:** Endpoints which are slow on average and need to be optimized.
- **2xx request count:** Number of successful requests (HTTP status code between 200 and 299) per endpoint.
- **5xx request count:** Number of server errors (HTTP status code of 500 or greater) per endpoint.
- **4xx request count:** Number of bad requests (HTTP status code between 400 and 499) per endpoint.
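Conceptually, these charts are simple aggregations over per-request (route, status code, duration) data. A minimal sketch with invented requests (the real dashboard derives these aggregates from the histogram metric, not from raw requests):

```python
from collections import defaultdict

# Invented (route, status_code, duration_seconds) samples for illustration.
requests = [
    ("GET /users", 200, 0.12),
    ("GET /users", 200, 0.08),
    ("GET /slow", 200, 1.50),
    ("GET /users", 404, 0.01),
    ("POST /jobs", 500, 0.30),
]

total = defaultdict(float)   # "Total duration" per endpoint
count = defaultdict(int)
by_class = defaultdict(int)  # "2xx/4xx/5xx request count" per (endpoint, class)

for route, status, duration in requests:
    total[route] += duration
    count[route] += 1
    by_class[(route, f"{status // 100}xx")] += 1

# "Average duration" per endpoint
average = {route: total[route] / count[route] for route in total}
```

Note how the two duration charts can disagree: `GET /slow` dominates total duration here even though `GET /users` receives more requests.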
### Token Usage
This dashboard breaks down input and output LLM token usage by model. It comes in two variants. Both have the same charts, but they use different data sources:
- **Token Usage (from `records`):** Uses data from the `records` table, specifically span [attributes](../../reference/sql.md#attributes) following OpenTelemetry conventions. This variant works with more instrumentations, as some don't emit metrics. It's also easier to [use as a template](#using-a-standard-dashboard-as-a-template) if you want to filter by other attributes.
- **Token Usage (from `metrics`):** Uses data from the `metrics` table, specifically the [`gen_ai.client.token.usage`](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/#metric-gen_aiclienttokenusage) metric. This variant is more performant, so you can load data over bigger time ranges more quickly. It's also more accurate if your spans are [sampled](../../how-to-guides/sampling.md).
If you're only using the [Pydantic AI](../../integrations/llms/pydanticai.md) instrumentation, and you have version 0.2.17 of Pydantic AI or later, we recommend using the `metrics` variant. Otherwise, we suggest enabling both variants and comparing them: if they look roughly identical (some small differences are expected), you can disable the `records` variant to improve performance.
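As a sketch of what the `records` variant aggregates, the invented spans below carry `gen_ai.*` attributes in the style of the OpenTelemetry GenAI semantic conventions; exact attribute names vary between instrumentations and spec versions, so check what yours actually emits.

```python
from collections import defaultdict

# Invented LLM span attributes; names follow the OpenTelemetry GenAI
# semantic conventions but are illustrative only.
spans = [
    {"gen_ai.response.model": "gpt-4o",
     "gen_ai.usage.input_tokens": 120, "gen_ai.usage.output_tokens": 30},
    {"gen_ai.response.model": "gpt-4o",
     "gen_ai.usage.input_tokens": 80, "gen_ai.usage.output_tokens": 50},
    {"gen_ai.response.model": "claude-3-5-sonnet",
     "gen_ai.usage.input_tokens": 200, "gen_ai.usage.output_tokens": 10},
]

# Sum input and output tokens per model, as the dashboard's charts do.
usage = defaultdict(lambda: {"input": 0, "output": 0})
for attrs in spans:
    model = attrs["gen_ai.response.model"]
    usage[model]["input"] += attrs.get("gen_ai.usage.input_tokens", 0)
    usage[model]["output"] += attrs.get("gen_ai.usage.output_tokens", 0)
```

The `metrics` variant arrives at the same totals from the pre-aggregated `gen_ai.client.token.usage` metric instead of summing over individual spans, which is why it is faster over large time ranges.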
### Basic System Metrics
This dashboard shows essential system resource utilization metrics. It comes in two variants: