1 parent 8aea8dd, commit 517af96
README.md
@@ -6,8 +6,7 @@ _For the implementation used in the paper [SWE-Search: Enhancing Software Agents
## SWE-Bench
I use the [SWE-bench benchmark](https://www.swebench.com/) as a way to verify my ideas.
-* [Claude 3.5 Sonnet v20241022 evaluation results](https://experiments.moatless.ai/evaluations/20250113_claude_3_5_sonnet_20241022_temp_0_0_iter_20_fmt_tool_call_hist_messages_lite) - 39% solve rate, 2.7 resolved instances per dollar
-* [Deepseek V3](https://experiments.moatless.ai/evaluations/20250111_deepseek_chat_v3_temp_0_0_iter_20_fmt_react_hist_react) - 30.7% solve rate, 24 resolved instances per dollar
+* Claude 4 Sonnet - 70.8% solve rate, $0.63 per instance.
# Try it out