feat: add aggregation metrics to evaluation score card #1523
Open
IgnazioDS wants to merge 1 commit into lmnr-ai:main from
Conversation
Extend evaluation statistics with additional aggregation metrics beyond the existing average. All metrics are universally applicable regardless of whether the score represents an error, a classification result, or a quality score.

Changes:
- types.ts: Expand EvaluationScoreStatistics with new fields
- utils.ts: Compute median, std deviation, min, max, count
- score-card.tsx: Display secondary metrics in a compact grid below the primary average display

Addresses lmnr-ai#637
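As a rough sketch of the computation the PR describes (field names are taken from the PR text; the actual implementation in lib/actions/evaluation/utils.ts may differ in shape and naming):

```typescript
// Sketch, not the PR's actual code: the statistics shape and aggregate
// computation as described in the PR text. Uses the population standard
// deviation (divide by N), matching the review note below.
interface EvaluationScoreStatistics {
  averageValue: number;
  medianValue: number;
  stdDeviation: number;
  minValue: number;
  maxValue: number;
  count: number;
}

function calculateScoreStatistics(scores: number[]): EvaluationScoreStatistics | null {
  const valid = scores.filter((s) => Number.isFinite(s));
  if (valid.length === 0) return null; // no valid scores: caller can hide the grid

  const count = valid.length;
  const averageValue = valid.reduce((a, b) => a + b, 0) / count;

  // Median: middle element, or the mean of the two middle elements for even counts.
  const sorted = [...valid].sort((a, b) => a - b);
  const mid = Math.floor(count / 2);
  const medianValue = count % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;

  // Population standard deviation: divide the squared deviations by N, not N - 1.
  const variance = valid.reduce((acc, s) => acc + (s - averageValue) ** 2, 0) / count;
  const stdDeviation = Math.sqrt(variance);

  return {
    averageValue,
    medianValue,
    stdDeviation,
    minValue: sorted[0],
    maxValue: sorted[count - 1],
    count,
  };
}
```

Returning null for an empty input is one way to model the "no valid scores" case the review note mentions; the real code may use a different sentinel.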
@Rainhunter13 @skull8888888 This adds median, std dev, min, max, and count to the evaluation score card — addresses #637. All metrics are universally applicable (no semantic metadata needed). Happy to adjust the UI layout or add/remove metrics based on your feedback!
Summary
Adds additional aggregation metrics to the evaluation score card, addressing #637.
Currently the evaluation results page only shows the average of numeric scores. This PR adds:

- Median
- Standard deviation
- Minimum
- Maximum
- Count
These metrics are displayed in a compact grid below the existing average display, preserving the current UI hierarchy.
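A hedged sketch of how the compact grid's entries might be assembled; secondaryMetrics and its formatting rule are hypothetical illustrations, not taken from the PR diff:

```typescript
// Hypothetical helper: builds label/value pairs for the secondary-metrics grid
// rendered below the primary average display. The real score-card.tsx may
// format values differently.
interface SecondaryStats {
  medianValue: number;
  stdDeviation: number;
  minValue: number;
  maxValue: number;
  count: number;
}

function secondaryMetrics(stats: SecondaryStats): Array<{ label: string; value: string }> {
  // Show integers bare, everything else with two decimal places (illustrative choice).
  const fmt = (n: number) => (Number.isInteger(n) ? String(n) : n.toFixed(2));
  return [
    { label: "Median", value: fmt(stats.medianValue) },
    { label: "Std dev", value: fmt(stats.stdDeviation) },
    { label: "Min", value: fmt(stats.minValue) },
    { label: "Max", value: fmt(stats.maxValue) },
    { label: "Count", value: String(stats.count) },
  ];
}
```

Keeping the grid data as plain label/value pairs keeps the React component a thin map over this array, which preserves the existing UI hierarchy the PR describes.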
Design Decisions
Files Changed
- lib/evaluation/types.ts: add minValue, maxValue, stdDeviation, medianValue, count to EvaluationScoreStatistics
- lib/actions/evaluation/utils.ts: update calculateScoreStatistics() to compute all metrics
- components/evaluation/score-card.tsx: display secondary metrics in a compact grid

Test Plan
Addresses #637
Note
Low Risk
Low-risk UI/data-display change that extends computed score statistics; the main risk is any downstream code expecting EvaluationScoreStatistics to only contain averageValue.

Overview
Extends EvaluationScoreStatistics to include medianValue, stdDeviation, minValue, maxValue, and count, and updates calculateScoreStatistics() to compute these aggregates (including population std dev and median). Updates the evaluation score card to render a compact secondary-metrics grid (median / std dev / min / max + count) beneath the existing average/comparison display, hiding the grid when there are no valid scores.
Written by Cursor Bugbot for commit d6b1215.