@@ -107,7 +107,7 @@ Here's a sample benchmark output:
 2. **benchmark_results.csv**: CSV file suitable for plotting in Excel or other tools
 3. **sequential_benchmark_results.json**: Results from the sequential commands example
 
-When using the LLM benchmark (`examples/02_benchmarks/performance_testing/benchmark.py`), additional files are generated in `benchmark_output/`:
+When using the canonical LLM benchmark (`benchmarks/llm_benchmark.py`), additional files are generated in `benchmark_output/`:
 
 1. **benchmark_results.json**: Raw per-query results + aggregated accuracy stats
 2. **benchmark_results.html**: Interactive charts
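The raw per-query results in `benchmark_results.json` can be post-processed directly. The exact schema is not shown in this diff, so the record fields below (`query`, `expected`, `predicted`) are assumptions; a minimal sketch of computing aggregated accuracy from such records:

```python
# Hypothetical per-query records -- the real benchmark_results.json
# schema may differ; this only sketches the accuracy aggregation step.
results = [
    {"query": "open the config file", "expected": "open", "predicted": "open"},
    {"query": "save my work", "expected": "save", "predicted": "close"},
]

# Count records where the predicted command matches the expected one.
correct = sum(r["expected"] == r["predicted"] for r in results)
accuracy = correct / len(results)
print(f"accuracy: {accuracy:.2f}")  # accuracy: 0.50
```

In practice you would `json.load()` the file from `benchmark_output/` instead of defining the list inline.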
@@ -152,7 +152,7 @@ Based on benchmark results, you can:
 
 ### Adding New Commands
 
-Edit `examples/02_benchmarks/performance_testing/benchmark.py` and modify the command lists:
+Edit `benchmarks/llm_benchmark.py` for the canonical benchmark, or `examples/02_benchmarks/performance_testing/benchmark.py` for the lightweight adapter example, and modify the command lists:
 
 ```python
 commands = {
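The diff truncates the `commands` dictionary after its opening line, so its real contents are not visible here. A plausible sketch, assuming the dictionary groups command strings by category (the category names and commands below are illustrative, not from the source):

```python
# Hypothetical command lists grouped by category; the actual keys and
# entries in the benchmark scripts may differ from these assumptions.
commands = {
    "simple": ["status", "help", "version"],
    "complex": ["search --limit 10", "export --format csv"],
}

# Iterate the groups the way a benchmark driver typically would.
for category, cmd_list in commands.items():
    print(f"{category}: {len(cmd_list)} commands")
```

Adding a new command is then just appending a string to the appropriate list (or adding a new category key).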
@@ -210,7 +210,7 @@ Add benchmarking to your CI pipeline: