Commit bd6d396
refactor: remove json-schema scorer from eval testing
- Remove json-schema scorer type from ResponseScorer interface
- Update test-config.json schema to only support regex and llm-judge scorers
- Remove runJsonSchemaScorer method and related logic from EvalTestRunner
- Replace any types with proper TypeScript types (CoreMessage, ToolCall, DisplayManager)
- Add ScorerResult interface for better type safety
- Update schema validation tests to reflect new scorer types
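As a rough illustration of the list above, the narrowed scorer types might look like the sketch below. The diff content is not reproduced here, so every field name (`pattern`, `flags`, `prompt`, `scorerType`, `passed`, `details`) and the helper `isSupportedScorerType` are hypothetical, and modelling `ResponseScorer` as a discriminated union is an assumption; only the names `ResponseScorer`, `ScorerResult`, and the two remaining scorer types come from the commit message.

```typescript
// Hypothetical shapes: the real definitions in the repository may differ.
// Only "regex" and "llm-judge" remain after "json-schema" is removed.
type ScorerType = "regex" | "llm-judge";

interface RegexScorer {
  type: "regex";
  pattern: string; // assumed field name
  flags?: string;  // assumed field name
}

interface LlmJudgeScorer {
  type: "llm-judge";
  prompt: string;  // assumed field name
}

// Discriminated union: the compiler rejects any other `type` value.
type ResponseScorer = RegexScorer | LlmJudgeScorer;

// A typed result in place of an untyped (`any`) return value.
interface ScorerResult {
  scorerType: ScorerType;
  passed: boolean;
  details: string;
}

// Illustrative regex scorer: passes when the pattern matches the response.
function runRegexScorer(scorer: RegexScorer, response: string): ScorerResult {
  const matched = new RegExp(scorer.pattern, scorer.flags).test(response);
  return {
    scorerType: "regex",
    passed: matched,
    details: matched ? "pattern matched" : "pattern did not match",
  };
}

// Runtime guard mirroring what a config schema restricted to the two
// remaining scorer types would enforce.
function isSupportedScorerType(value: string): value is ScorerType {
  return value === "regex" || value === "llm-judge";
}
```

With types along these lines, a config entry that still declares `"json-schema"` would fail both the type check and the runtime guard, which is the behaviour the updated schema validation tests would be expected to cover.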
JSON schema validation doesn't serve a realistic use case for LLM response evaluation since responses should be natural language, not structured data requiring schema validation.
1 parent 6c1c02c commit bd6d396
File tree
4 files changed, +25 -66 lines changed
- src
- core
- schemas
- testing/evals
- test/unit