--- a/README.md
+++ b/README.md
@@ -214,4 +214,18 @@ The evaluation result contains additional scoring fields:
 - `key_validation_score`: Score from validating expected keys in JSON output (for non-renderable outputs)
 - `raw_output_eval`: Array of boolean values indicating whether each raw output metric was satisfied
 - `raw_output_score`: Score from the raw output evaluation
-- `final_eval_score`: Overall evaluation score between 0 and 1
+- `final_eval_score`: Overall evaluation score between 0 and 1
+
+## Citation
+Please cite us with the following bibtex:
+```
+@misc{yang2025structeval,
+title={StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs},
+author={Jialin Yang and Dongfu Jiang and Lipeng He and Sherman Siu and Yuxuan Zhang and Disen Liao and Zhuofeng Li and Huaye Zeng and Yiming Jia and Haozhe Wang and Benjamin Schneider and Chi Ruan and Wentao Ma and Zhiheng Lyu and Yifei Wang and Yi Lu and Quy Duc Do and Ziyan Jiang and Ping Nie and Wenhu Chen},