Commit fc31867
chore: allows evaluator to run on existing predictions (#734)
# Motivation
Allows evaluation to be run on an existing predictions jsonl file.
# Content
- Modified the prediction-loading logic to check for an existing consolidated
JSONL file before creating one.
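
The check described above could look roughly like the following. This is a minimal sketch, not the code from this commit: the function name `load_or_create_predictions`, the default file name `all_preds.jsonl`, and the one-JSON-file-per-prediction layout are all assumptions made for illustration.

```python
import json
from pathlib import Path


def load_or_create_predictions(
    predictions_dir: Path,
    consolidated_name: str = "all_preds.jsonl",  # hypothetical file name
) -> Path:
    """Return the consolidated JSONL path, building the file only if it is missing."""
    consolidated = predictions_dir / consolidated_name
    if consolidated.exists():
        # Reuse the existing file so evaluation can run on earlier predictions.
        return consolidated

    # Otherwise merge per-prediction JSON files (assumed layout) into one JSONL file.
    # The "*.json" pattern does not match the ".jsonl" output file itself.
    with consolidated.open("w", encoding="utf-8") as out:
        for pred_file in sorted(predictions_dir.glob("*.json")):
            record = json.loads(pred_file.read_text(encoding="utf-8"))
            out.write(json.dumps(record) + "\n")
    return consolidated
```

Returning early when the file exists means a second evaluation run never rewrites predictions that were produced earlier.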
# Testing
Tested by running locally
# Please check the following before marking your PR as ready for review
- [x] I have added tests for my changes
- [x] I have updated the documentation or added new documentation as
needed

1 parent e23dac4 commit fc31867
File tree
2 files changed, +20 -20 lines changed
- codegen-examples/examples/swebench_agent_run
- src/codegen/extensions/swebench
Diff (line contents not preserved; only the line-number columns survive):
- First changed file: one hunk, original lines 278-291 to new lines 278-291 (+3, -3).
- Second changed file: two hunks, original lines 113-118 to new lines 113-120 (+2, -0), and original lines 126-154 to new lines 128-154 (+15, -17).