Commit ff75842

fix(ai): normalize boolean scores in onlineEval scoresSummary (#263)
## Overview

- `onlineEval()` was writing raw boolean scores (`true`/`false`) into the parent eval span's `eval.case.scores` attribute, while child scorer spans correctly normalized them to `1`/`0` with `eval.score.is_boolean` metadata via `normalizeBooleanScore()`
- Apply the same `normalizeBooleanScore()` call when building `scoresSummary` so both parent and child spans produce consistent numeric scores

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> Small telemetry-only change that affects how scores are serialized into span attributes; low risk aside from potential downstream expectations of boolean values.
>
> **Overview**
> Ensures `onlineEval()` writes consistent numeric scores into the parent eval span's `eval.case.scores` summary by normalizing boolean `score` values (`true/false` → `1/0`) and propagating the corresponding `eval.score.is_boolean` metadata.
>
> This updates `onlineEval.ts` to call `normalizeBooleanScore()` while building `scoresSummary`, and only emits normalized metadata when non-empty.
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit bfa6ce7. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
1 parent 87f5add commit ff75842

1 file changed: +9 -3 lines changed

packages/ai/src/online-evals/onlineEval.ts

Lines changed: 9 additions & 3 deletions
@@ -9,6 +9,7 @@ import type {
   ScorerSampling,
 } from './types';
 import { executeScorer } from './executor';
+import { normalizeBooleanScore } from '../evals/normalize-score';
 import { Attr } from '../otel/semconv/attributes';
 import type { ValidateName } from '../util/name-validation';
 import { isValidName } from '../util/name-validation-runtime';
@@ -359,11 +360,16 @@ async function executeOnlineEvalInternal<
 
   const scoresSummary: Record<string, ScorerResult> = {};
   for (const [name, result] of Object.entries(results)) {
+    const { score: normalizedScore, metadata: normalizedMetadata } = normalizeBooleanScore(
+      result.score,
+      result.metadata,
+    );
+
     scoresSummary[name] = {
       name: result.name,
-      score: result.score,
-      ...(result.metadata &&
-        Object.keys(result.metadata).length > 0 && { metadata: result.metadata }),
+      score: normalizedScore,
+      ...(normalizedMetadata &&
+        Object.keys(normalizedMetadata).length > 0 && { metadata: normalizedMetadata }),
       ...(result.error && { error: result.error }),
     };
   }
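
For readers without the repository at hand, the summary-building loop can be sketched in isolation. This is a minimal, self-contained illustration, not the code in `packages/ai`: the body of `normalizeBooleanScore()` is not part of this commit, so `normalizeBooleanScoreSketch` below is a hypothetical stand-in whose behavior is inferred from the commit message (booleans become `1`/`0` and an `eval.score.is_boolean` metadata key is set). The `RawResult` and `SummaryEntry` types are likewise assumed shapes, and the sketch uses plain `if` statements where the diff uses conditional spreads.

```typescript
// Assumed shapes; the real ScorerResult type lives in './types' and may differ.
type Metadata = Record<string, unknown>;
type RawScore = number | boolean | null;

interface RawResult {
  name: string;
  score: RawScore;
  metadata?: Metadata;
  error?: string;
}

interface SummaryEntry {
  name: string;
  score: number | null;
  metadata?: Metadata;
  error?: string;
}

// Hypothetical stand-in for normalizeBooleanScore(): maps true/false to 1/0
// and records that the original score was a boolean. Non-boolean scores and
// their metadata pass through unchanged.
function normalizeBooleanScoreSketch(
  score: RawScore,
  metadata?: Metadata,
): { score: number | null; metadata: Metadata | undefined } {
  if (typeof score !== 'boolean') {
    return { score, metadata };
  }
  return {
    score: score ? 1 : 0,
    metadata: { ...metadata, 'eval.score.is_boolean': true },
  };
}

// Mirrors the loop in the diff: normalize first, then emit metadata only
// when it is non-empty, and emit error only when it is set.
function buildScoresSummary(
  results: Record<string, RawResult>,
): Record<string, SummaryEntry> {
  const summary: Record<string, SummaryEntry> = {};
  for (const [key, result] of Object.entries(results)) {
    const { score, metadata } = normalizeBooleanScoreSketch(result.score, result.metadata);
    const entry: SummaryEntry = { name: result.name, score };
    if (metadata && Object.keys(metadata).length > 0) {
      entry.metadata = metadata;
    }
    if (result.error) {
      entry.error = result.error;
    }
    summary[key] = entry;
  }
  return summary;
}

// Example input with made-up scorer names.
const exampleSummary = buildScoresSummary({
  isPolite: { name: 'isPolite', score: true },
  length: { name: 'length', score: 0.42, metadata: {} },
});
console.log(JSON.stringify(exampleSummary));
// {"isPolite":{"name":"isPolite","score":1,"metadata":{"eval.score.is_boolean":true}},"length":{"name":"length","score":0.42}}
```

Under these assumptions, a boolean scorer result now serializes as a number plus a marker key, so a consumer of `eval.case.scores` can still recover that the score was originally boolean, while the empty `metadata` object on the numeric result is dropped as before.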
