feat!: Add support for real time judge evals #969

jsonbailey · 2025-10-29T19:04:34Z

feat: Added judgeConfig method to AI SDK to retrieve an AI Judge Config
feat: Added createJudge method to create a Judge based on the judge key provided
feat: Added trackEvalScores method to config tracker
feat: Chat will evaluate responses with configured judges
fix!: AI Config defaults require the "enabled" attribute
fix!: Renamed LDAIAgentConfig to LDAIAgentConfigRequest for improved clarity
fix!: Renamed LDAIAgent to LDAIAgentConfig (note: this name was previously used for the type now called LDAIAgentConfigRequest)
fix!: Renamed LDAIAgentDefault to LDAIAgentConfigDefault for improved clarity
fix!: Renamed LDAIDefaults to LDAICompletionConfigDefault for improved clarity
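
To illustrate how the judge features listed above might fit together, here is a minimal sketch. Every type and function in it (`EvalScore`, `JudgeSketch`, `TrackerSketch`, `evaluateWithJudges`) is a hypothetical stand-in, not the SDK's API. Only the name `trackEvalScores` comes from this PR, and its real signature may differ.

```typescript
// Hypothetical shapes for illustration only; the SDK's real types may differ.
interface EvalScore {
  score: number; // assumed to be in the range 0..1
  reasoning: string;
}

interface JudgeSketch {
  key: string;
  evaluate(input: string, output: string): Promise<EvalScore>;
}

interface TrackerSketch {
  // Stand-in for the tracker method this PR names trackEvalScores.
  trackEvalScores(scores: Record<string, EvalScore>): void;
}

// Run every judge against one chat exchange and report the scores together.
// A judge that throws is skipped so one failure does not discard the others.
async function evaluateWithJudges(
  judges: JudgeSketch[],
  tracker: TrackerSketch,
  input: string,
  output: string,
): Promise<Record<string, EvalScore>> {
  const results = await Promise.all(
    judges.map(async (judge): Promise<[string, EvalScore] | undefined> => {
      try {
        return [judge.key, await judge.evaluate(input, output)];
      } catch (err) {
        return undefined; // skip the failed judge
      }
    }),
  );

  const scores: Record<string, EvalScore> = {};
  for (const entry of results) {
    if (entry) {
      scores[entry[0]] = entry[1];
    }
  }
  tracker.trackEvalScores(scores);
  return scores;
}
```

Running the judges concurrently and tracking once keeps the evaluation off the critical path of the chat response, which is one plausible reading of "Chat will evaluate responses with configured judges".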

Note

Introduces Judge evaluations (judgeConfig/createJudge) with structured outputs and automatic chat scoring, refactors config types/modes, updates tracking, and deprecates older APIs.

Judging & Evaluation:
- Add Judge with structured output (invokeStructuredModel) and schema builder; new judgeConfig and createJudge APIs.
- TrackedChat can attach judges and asynchronously evaluate responses; results tracked via trackEvalScores.
Core Client/API:
- New completionConfig, agentConfig, judgeConfig, agentConfigs, createChat, createJudge methods; legacy config, agent, agents, initChat deprecated (now wrappers).
- Mode-aware evaluation with LDAIConfigUtils and disabled-return on mode mismatch.
Types & Structure:
- Consolidate config types in api/config/types with modes: completion | agent | judge and new defaults (LDAICompletionConfigDefault, LDAIAgentConfigDefault, LDAIJudgeConfigDefault).
- Replace old LDAIConfig/agent types; remove api/agents module; add judge attachment (LDJudgeConfiguration).
Providers:
- Extend AIProvider with default invokeModel and new invokeStructuredModel; update AIProviderFactory to accept union config kinds.
Tracking:
- Expose getTrackData; add trackEvalScores; new tracking event keys; minor tracker refactors.
Docs/Examples/Tests:
- README and examples updated (createChat, enabled checks); comprehensive tests for Judge, client config APIs, and TrackedChat.
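
The "disabled-return on mode mismatch" behavior summarized above can be sketched roughly as follows. This is a simplified model under stated assumptions: `ConfigSketch` and `evaluateForMode` are hypothetical, `createDisabledConfig` borrows its name from `LDAIConfigUtils.createDisabledConfig` but its signature here is a guess, and treating a missing mode as `completion` is an assumption.

```typescript
// Hypothetical, simplified shapes; the real types live in api/config/types.
type Mode = 'completion' | 'agent' | 'judge';

interface ConfigSketch {
  enabled: boolean;
  mode?: Mode;
  messages?: { role: string; content: string }[];
}

// Stand-in for the idea behind LDAIConfigUtils.createDisabledConfig:
// a config that records the requested mode but carries nothing usable.
function createDisabledConfig(mode: Mode): ConfigSketch {
  return { enabled: false, mode };
}

// Return the evaluated config only when its mode matches the caller's request.
function evaluateForMode(flagValue: ConfigSketch, requested: Mode): ConfigSketch {
  // Assumption: a missing mode is treated as 'completion'.
  const actual: Mode = flagValue.mode ?? 'completion';
  return actual === requested ? flagValue : createDisabledConfig(requested);
}
```

With this shape, a caller that asks for a completion config but receives a judge config gets `enabled: false` back and can check that flag before use, rather than handling an exception.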

Written by Cursor Bugbot for commit a187fcc. This will update automatically on new commits.

github-actions · 2025-10-29T19:07:09Z

@launchdarkly/browser size report
This is the brotli compressed size of the ESM build.
Compressed size: 169118 bytes
Compressed size limit: 200000
Uncompressed size: 789399 bytes

github-actions · 2025-10-29T19:07:32Z

@launchdarkly/js-sdk-common size report
This is the brotli compressed size of the ESM build.
Compressed size: 24988 bytes
Compressed size limit: 26000
Uncompressed size: 122411 bytes

github-actions · 2025-10-29T19:07:41Z

@launchdarkly/js-client-sdk size report
This is the brotli compressed size of the ESM build.
Compressed size: 21721 bytes
Compressed size limit: 25000
Uncompressed size: 74698 bytes

github-actions · 2025-10-29T19:07:46Z

@launchdarkly/js-client-sdk-common size report
This is the brotli compressed size of the ESM build.
Compressed size: 17636 bytes
Compressed size limit: 20000
Uncompressed size: 90259 bytes

tanderson-ld · 2025-10-30T14:23:53Z

Will feat! cause it to release a non-alpha version?

kinyoklion · 2025-10-30T15:51:53Z

> Will feat! cause it to release a non-alpha version?

If this is set:
"bump-minor-pre-major": true,

Then it will only be a minor.
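
As a rough illustration of the behavior described here, the sketch below models the bump decision for a plain `x.y.z` version. It is a simplified model, not release-please's implementation: `nextVersion` and `ChangeKind` are hypothetical names, the example version numbers are made up, and pre-release suffixes such as `-alpha` are ignored.

```typescript
type ChangeKind = 'breaking' | 'feature' | 'fix';

// Simplified model of the bump decision; not release-please's actual code.
// Expects a plain "x.y.z" version string without a pre-release suffix.
function nextVersion(current: string, change: ChangeKind, bumpMinorPreMajor: boolean): string {
  const [major, minor, patch] = current.split('.').map(Number);

  if (change === 'breaking') {
    // Before 1.0.0, the option downgrades a breaking bump from major to minor.
    if (major === 0 && bumpMinorPreMajor) {
      return `0.${minor + 1}.0`;
    }
    return `${major + 1}.0.0`;
  }
  if (change === 'feature') {
    return `${major}.${minor + 1}.0`;
  }
  return `${major}.${minor}.${patch + 1}`;
}

console.log(nextVersion('0.12.3', 'breaking', true)); // prints "0.13.0"
console.log(nextVersion('0.12.3', 'breaking', false)); // prints "1.0.0"
```

In this model the `feat!` marker still classifies the change as breaking; the option only changes which version component is incremented while the major version is 0.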

packages/sdk/server-ai/src/LDAIClientImpl.ts

packages/sdk/server-ai/src/api/config/types.ts

packages/sdk/server-ai/src/api/judge/Judge.ts

jsonbailey · 2025-10-31T17:10:41Z

> Will feat! cause it to release a non-alpha version?
>
> If this is set: "bump-minor-pre-major": true,
>
> Then it will only be a minor.

I only want it to be a minor, but I want the proper change logs to show breaking even for the minor bump.

cursor · 2025-11-05T16:25:34Z

packages/sdk/server-ai/src/LDAIClientImpl.ts

+    this._ldClient.track(TRACK_CONFIG_SINGLE, context, key, 1);
+
+    const config = await this._evaluate(key, context, defaultValue, 'completion', variables);
+    return this._addVercelAISDKSupport(config as LDAICompletionConfig);


Bug

When completionConfig receives a disabled config from _evaluate (due to mode mismatch), it still calls _addVercelAISDKSupport on it. However, the disabled config returned by LDAIConfigUtils.createDisabledConfig is cast to LDAICompletionConfig but doesn't actually have the proper structure. The _addVercelAISDKSupport method will try to access config.messages which will be undefined for disabled configs, and create a mapper with undefined messages. While this may not crash, it's inconsistent behavior - disabled configs shouldn't have the toVercelAISDK method added since they can't be used anyway. The test at line 127 in the diff expects toVercelAISDK to be present even for disabled configs, but this creates a misleading API where a disabled config appears to have functionality it shouldn't use.
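
One possible way to address the inconsistency described above is to skip the mapper for disabled configs. The sketch below is only an illustration of that idea, not the fix applied in this PR: `CompletionConfigSketch`, `Message`, and `addVercelSupportIfEnabled` are hypothetical names, and the return shape of `toVercelAISDK` is a placeholder.

```typescript
// Hypothetical, simplified shapes; not the SDK's real types.
interface Message {
  role: string;
  content: string;
}

interface CompletionConfigSketch {
  enabled: boolean;
  messages?: Message[];
  toVercelAISDK?: () => { messages: Message[] };
}

// Attach the mapper only to enabled configs, so a disabled config
// does not advertise functionality it cannot provide.
function addVercelSupportIfEnabled(config: CompletionConfigSketch): CompletionConfigSketch {
  if (!config.enabled) {
    return config;
  }
  // Fall back to an empty list so the mapper never closes over undefined.
  const messages = config.messages ?? [];
  return { ...config, toVercelAISDK: () => ({ messages }) };
}
```

Callers would then need to check for the method (or the `enabled` flag) before invoking it, which is a behavior change relative to the test the comment mentions.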

jsonbailey · 2025-11-05

This method is deprecated and will be removed in a separate PR.

packages/sdk/server-ai/src/api/chat/TrackedChat.ts

feat!: Add support for real time judge evals

884bfc7

jsonbailey requested a review from a team as a code owner October 29, 2025 19:04

jsonbailey mentioned this pull request Oct 29, 2025

feat!: Support invoke with structured output in LangChain provider #970

Merged

jsonbailey requested a review from a team October 29, 2025 19:40

jsonbailey added 2 commits October 30, 2025 03:19

return the configs with trackers attached to stay consistent in ai sdk

af43b47

set defaults for ai provider and adjust some type names for ai sdk

37b54e7

tanderson-ld reviewed Oct 30, 2025

jsonbailey added 5 commits October 31, 2025 21:22

fix: Update examples to include required params

4bf4c02

additional tests and code review feedback for ai sdk

5d48342

fix: enable second judge interpolation

2f9854c

feat: Automatically judge chat results based on AI Config

2bf8872

provide clearer method names

b83c1e3

tanderson-ld self-requested a review November 3, 2025 15:22

tanderson-ld approved these changes Nov 3, 2025

jsonbailey added 3 commits November 4, 2025 22:15

Merge branch 'main' into jb/sdk-1500/implement-ai-judges

6cc3c43

ensure we have a tracker before using it

c682595

fix naming of completion config type

c51d811

cursor bot reviewed Nov 5, 2025

pr feedback, cleanup, fixing tests

31a80d6

cursor bot reviewed Nov 5, 2025

packages/sdk/server-ai/src/api/chat/TrackedChat.ts (resolved)

judge with original messages

a187fcc

jsonbailey merged commit 6ecd9ab into main Nov 5, 2025
32 checks passed

jsonbailey deleted the jb/sdk-1500/implement-ai-judges branch November 5, 2025 18:30

github-actions bot mentioned this pull request Nov 5, 2025

chore: release main #982

Open

4 participants