Commit ad75f38

all-hands-bot, openhands-agent, and juanmichelini authored
Add swt-bench results for glm-4.7 (#532)
Co-authored-by: openhands <[email protected]>
Co-authored-by: Juan Michelini <[email protected]>
1 parent 1ecaa0a commit ad75f38

2 files changed, +14 -1 lines changed

results/glm-4.7/metadata.json

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@
   "openness": "open_weights",
   "country": "cn",
   "tool_usage": "standard",
-  "submission_time": "2026-02-09T17:50:43.756380+00:00",
+  "submission_time": "2026-02-09T17:50:40.185099+00:00",
   "directory_name": "glm-4.7",
   "release_date": "2025-12-22",
   "parameter_count_b": 355,

results/glm-4.7/scores.json

Lines changed: 13 additions & 0 deletions
@@ -13,6 +13,19 @@
     "submission_time": "2026-01-30T12:51:32.483444+00:00"
   },
   {
+    "benchmark": "swt-bench",
+    "score": 49.4,
+    "metric": "accuracy",
+    "cost_per_instance": 0.37,
+    "average_runtime": 744.0,
+    "full_archive": "https://results.eval.all-hands.dev/swtbench/litellm_proxy-openrouter-z-ai-glm-4-7/21548136286/results.tar.gz",
+    "tags": [
+      "swt-bench"
+    ],
+    "agent_version": "v1.10.0",
+    "submission_time": "2026-02-01T04:32:01+00:00"
+  },
+  {
     "benchmark": "gaia",
     "score": 53.9,
     "metric": "accuracy",
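
For readers who want to consume the new entry, here is a minimal sketch of looking up a benchmark result. It assumes scores.json is a top-level JSON array of entry objects, which the diff suggests but does not confirm. The embedded sample and the helper name `find_entry` are hypothetical and not part of this repository. The `full_archive` field is omitted from the sample for brevity.

```python
import json

# Hypothetical excerpt mirroring the entry added in this commit.
# Assumption: scores.json is a top-level JSON array of entry objects.
SCORES_JSON = """
[
  {
    "benchmark": "swt-bench",
    "score": 49.4,
    "metric": "accuracy",
    "cost_per_instance": 0.37,
    "average_runtime": 744.0,
    "tags": ["swt-bench"],
    "agent_version": "v1.10.0",
    "submission_time": "2026-02-01T04:32:01+00:00"
  }
]
"""


def find_entry(entries, benchmark):
    """Return the first entry whose "benchmark" field matches, or None."""
    return next((e for e in entries if e.get("benchmark") == benchmark), None)


entries = json.loads(SCORES_JSON)
swt = find_entry(entries, "swt-bench")
```

With the real file, `json.load` on an open file handle would replace `json.loads(SCORES_JSON)`, provided the top-level structure matches the assumption above.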