
feat: add aggregation metrics to evaluation score card#1523

Open
IgnazioDS wants to merge 1 commit into lmnr-ai:main from IgnazioDS:feat/eval-aggregation-metrics

Conversation


@IgnazioDS IgnazioDS commented Mar 26, 2026

Summary

Adds additional aggregation metrics to the evaluation score card, addressing #637.

Currently the evaluation results page only shows the average of numeric scores. This PR adds:

  • Median — robust central tendency, less sensitive to outliers
  • Standard Deviation — measures score consistency/spread
  • Min / Max — range boundaries
  • Count — number of data points

These metrics are displayed in a compact grid below the existing average display, preserving the current UI hierarchy.

Design Decisions

  • All metrics are universally applicable — no need to distinguish between error metrics, quality scores, or classifications. This avoids the complexity around RMSE semantics discussed in #637 (Support more aggregations of numeric eval output besides average and histogram).
  • Population std deviation (not sample) — since we're computing over the full evaluation run, not a sample.
  • Zero-cost for empty results — all fields default to 0 when no valid scores exist.
  • Comparison mode preserved — the primary average + comparison arrow UI is untouched; secondary metrics appear below.
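The decisions above can be sketched in a minimal form like this (the function and field names mirror the PR description, but the actual signature in `lib/actions/evaluation/utils.ts` may differ):

```typescript
interface EvaluationScoreStatistics {
  averageValue: number;
  medianValue: number;
  stdDeviation: number;
  minValue: number;
  maxValue: number;
  count: number;
}

// Sketch of the extended aggregation; all fields default to 0 when
// there are no valid scores, matching the "zero-cost for empty results" decision.
function calculateScoreStatistics(scores: number[]): EvaluationScoreStatistics {
  const valid = scores.filter((s) => Number.isFinite(s));
  const count = valid.length;
  if (count === 0) {
    return { averageValue: 0, medianValue: 0, stdDeviation: 0, minValue: 0, maxValue: 0, count: 0 };
  }
  const averageValue = valid.reduce((a, b) => a + b, 0) / count;
  // Population variance: divide by N, since the full evaluation run is the population.
  const variance = valid.reduce((a, b) => a + (b - averageValue) ** 2, 0) / count;
  const sorted = [...valid].sort((a, b) => a - b);
  const mid = Math.floor(count / 2);
  const medianValue = count % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return {
    averageValue,
    medianValue,
    stdDeviation: Math.sqrt(variance),
    minValue: sorted[0],
    maxValue: sorted[count - 1],
    count,
  };
}
```

Because the population formula divides by N rather than N−1, the std deviation of a single-score run is 0 rather than undefined, which keeps the UI simple.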

Files Changed

| File | Change |
| --- | --- |
| `lib/evaluation/types.ts` | Added `minValue`, `maxValue`, `stdDeviation`, `medianValue`, `count` to `EvaluationScoreStatistics` |
| `lib/actions/evaluation/utils.ts` | Extended `calculateScoreStatistics()` to compute all metrics |
| `components/evaluation/score-card.tsx` | Added secondary metrics grid below average display |
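For the score-card change, the hide-when-empty behavior can be sketched as a small helper that derives the grid rows from the statistics (the helper name and formatting are illustrative, not taken from the diff):

```typescript
interface EvaluationScoreStatistics {
  averageValue: number;
  medianValue: number;
  stdDeviation: number;
  minValue: number;
  maxValue: number;
  count: number;
}

// Label/value pairs for the secondary metrics grid; an empty list means
// the grid is hidden (no valid scores), leaving only the primary average display.
function secondaryMetrics(stats: EvaluationScoreStatistics): [string, string][] {
  if (stats.count === 0) return [];
  const fmt = (v: number) => v.toFixed(2);
  return [
    ["Median", fmt(stats.medianValue)],
    ["Std Dev", fmt(stats.stdDeviation)],
    ["Min", fmt(stats.minValue)],
    ["Max", fmt(stats.maxValue)],
    ["Count", String(stats.count)],
  ];
}
```

Keeping this derivation separate from the comparison-arrow rendering is what lets the existing primary average + comparison UI stay untouched.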

Test Plan

  • Evaluation with numeric scores shows all 6 metrics (avg, median, std dev, min, max, count)
  • Evaluation with no scores shows avg = 0 and no secondary metrics grid
  • Comparison mode (two evaluations) still renders correctly with arrow + percentage change
  • Metrics are accurate: verify against manual calculation on a small dataset
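For the manual-calculation check, a small dataset like `[1, 2, 3, 4]` works as a reference (values below computed by hand, not taken from the PR):

```typescript
// Manual reference computation for scores [1, 2, 3, 4]:
const avg = (1 + 2 + 3 + 4) / 4;                 // 2.5
const median = (2 + 3) / 2;                      // 2.5 (even count: mean of middle two)
// Population variance: mean of squared deviations from the average.
const variance =
  ((1 - avg) ** 2 + (2 - avg) ** 2 + (3 - avg) ** 2 + (4 - avg) ** 2) / 4; // 1.25
const stdDev = Math.sqrt(variance);              // ≈ 1.118
console.log({ avg, median, stdDev, min: 1, max: 4, count: 4 });
```

The score card should display these same six values for that dataset.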

Addresses #637


Note

Low Risk
Low-risk UI/data-display change that extends the computed score statistics; the main risk is downstream code that expects EvaluationScoreStatistics to contain only averageValue.

Overview
Extends EvaluationScoreStatistics to include medianValue, stdDeviation, minValue, maxValue, and count, and updates calculateScoreStatistics() to compute these aggregates (including population std dev and median).

Updates the evaluation score card to render a compact secondary-metrics grid (median/std dev/min/max + count) beneath the existing average/comparison display, hiding the grid when there are no valid scores.

Written by Cursor Bugbot for commit d6b1215.

Extend evaluation statistics with additional aggregation metrics beyond
the existing average. All metrics are universally applicable regardless
of whether the score represents an error, a classification result, or
a quality score.

Changes:
- types.ts: Expand EvaluationScoreStatistics with new fields
- utils.ts: Compute median, std deviation, min, max, count
- score-card.tsx: Display secondary metrics in a compact grid below
  the primary average display

Addresses lmnr-ai#637
@IgnazioDS

@Rainhunter13 @skull8888888 This adds median, std dev, min, max, and count to the evaluation score card — addresses #637. All metrics are universally applicable (no semantic metadata needed). Happy to adjust the UI layout or add/remove metrics based on your feedback!
