Merged
14 changes: 12 additions & 2 deletions src/pages/docs/api/eval-tasks/eval-task-aggregations.mdx
@@ -10,7 +10,9 @@ description: "Aggregate eval-task results as per-eval rollups, per-span pivots,
parameters={[
{"name": "eval_task_id", "in": "query", "required": true, "description": "UUID of the eval task to aggregate.", "type": "string"},
{"name": "eval_aggregation", "in": "query", "required": false, "description": "When true, returns the per-eval rollup keyed by eval name.", "type": "boolean"},
{"name": "span_aggregation", "in": "query", "required": false, "description": "When true, returns the per-span pivot keyed by span ID.", "type": "boolean"}
{"name": "span_aggregation", "in": "query", "required": false, "description": "When true, returns the per-span pivot keyed by span ID.", "type": "boolean"},
{"name": "start_date", "in": "query", "required": false, "description": "ISO-8601 lower bound on span creation time. Only spans created at or after this instant are aggregated.", "type": "string"},
{"name": "end_date", "in": "query", "required": false, "description": "ISO-8601 upper bound on span creation time. Only spans created at or before this instant are aggregated.", "type": "string"}
]}
responseExample={{
eval_task_id: "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
@@ -68,6 +70,12 @@ description: "Aggregate eval-task results as per-eval rollups, per-span pivots,
<ParamField query="span_aggregation" type="boolean" optional>
When `true`, the response includes the `span_aggregation` object — one entry per span the task evaluated, keyed by `span_id`, with the raw value of every eval that touched it. Defaults to `false`. At least one of `eval_aggregation` or `span_aggregation` must be `true`.
</ParamField>
<ParamField query="start_date" type="ISO-8601 datetime" optional>
Inclusive lower bound on the **span's `created_at`** — only eval runs whose linked span was created at or after this instant are aggregated. When omitted, no lower bound is applied.
</ParamField>
<ParamField query="end_date" type="ISO-8601 datetime" optional>
Inclusive upper bound on the **span's `created_at`** — only eval runs whose linked span was created at or before this instant are aggregated. When omitted, no upper bound is applied.
</ParamField>
</ApiSection>

<ApiSection title="Response" status={200} statusText="OK">
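For reviewers who want to exercise the new parameters, here is a minimal client-side sketch of how the query string might be assembled. The function name `build_aggregation_query`, the lowercase `true`/`false` serialization of booleans, and the choice to require timezone-aware datetimes are all assumptions made for illustration; the diff does not specify them, and the endpoint path is deliberately left out because it does not appear in this hunk.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def build_aggregation_query(
    eval_task_id,
    *,
    eval_aggregation=False,
    span_aggregation=False,
    start_date=None,
    end_date=None,
):
    """Return a URL-encoded query string for the aggregation request.

    Hypothetical helper: only the parameter names come from the docs.
    """
    # The docs state that at least one aggregation flag must be true.
    if not (eval_aggregation or span_aggregation):
        raise ValueError(
            "At least one of eval_aggregation or span_aggregation must be true"
        )

    params = {
        "eval_task_id": eval_task_id,
        # Assumption: booleans are sent as lowercase strings.
        "eval_aggregation": "true" if eval_aggregation else "false",
        "span_aggregation": "true" if span_aggregation else "false",
    }

    # Both bounds are optional; omitting one leaves that side unbounded.
    for name, value in (("start_date", start_date), ("end_date", end_date)):
        if value is None:
            continue
        if value.tzinfo is None:
            # Assumption: reject naive datetimes so the instant is unambiguous.
            raise ValueError(f"{name} must be timezone-aware")
        params[name] = value.isoformat()  # ISO-8601 with UTC offset

    return urlencode(params)


if __name__ == "__main__":
    print(
        build_aggregation_query(
            "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
            span_aggregation=True,
            start_date=datetime(2024, 1, 1, tzinfo=timezone.utc),
        )
    )
```

The resulting string can be appended to whatever base URL the endpoint actually uses.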
@@ -109,7 +117,9 @@ description: "Aggregate eval-task results as per-eval rollups, per-span pivots,
<Note>
Soft-deleted eval runs are skipped in both aggregations so the rollups reflect the user's current view of the data.

`span_aggregation` only includes span-target eval runs — session- and trace-target eval runs (where there is no underlying span) are not included.
Both `eval_aggregation` and `span_aggregation` only include span-linked eval runs — session-target eval runs (where there is no underlying span) are excluded from both rollups, regardless of whether a date range is supplied.

`start_date` and `end_date` filter on the **span's creation time** (`observation_span.created_at`), not on when the eval ran. The aggregation results therefore reflect only those spans that were created in the supplied window — eval runs against spans created outside the window are dropped from both rollups. When neither parameter is supplied, every span linked to the eval task is included.
</Note>

<ApiSection title="Errors">
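The filtering rules described in the updated note can be restated as a small predicate. This is only a sketch of the documented semantics, not the server's implementation: the `EvalRun` dataclass, its field names, and the `is_aggregated` function are hypothetical, and the real service presumably applies the same conditions in a database query against `observation_span.created_at`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class EvalRun:
    """Hypothetical stand-in for one eval run row."""
    eval_name: str
    span_created_at: Optional[datetime]  # None when no span is linked
    deleted: bool = False                # soft-delete flag


def is_aggregated(
    run: EvalRun,
    start_date: Optional[datetime] = None,
    end_date: Optional[datetime] = None,
) -> bool:
    """Return True if the run would count toward either rollup."""
    if run.deleted:
        return False  # soft-deleted runs are skipped
    if run.span_created_at is None:
        return False  # runs without a linked span are excluded
    # Both bounds are inclusive and compare against the span's creation
    # time, not the time the eval ran.
    if start_date is not None and run.span_created_at < start_date:
        return False
    if end_date is not None and run.span_created_at > end_date:
        return False
    return True


if __name__ == "__main__":
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    end = datetime(2024, 1, 31, tzinfo=timezone.utc)
    runs = [
        EvalRun("toxicity", datetime(2024, 1, 1, tzinfo=timezone.utc)),
        EvalRun("toxicity", datetime(2024, 2, 1, tzinfo=timezone.utc)),
        EvalRun("toxicity", None),
    ]
    kept = [r for r in runs if is_aggregated(r, start, end)]
    print(len(kept))
```

With neither bound supplied, the predicate keeps every non-deleted run that has a linked span, which matches the note's statement that every span linked to the eval task is included in that case.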