
Commit 49fee5b

Update eval doc
1 parent f55e103 commit 49fee5b


1 file changed (+4, -0 lines changed)


docs/evals.md

Lines changed: 4 additions & 0 deletions
@@ -35,6 +35,8 @@ The evals tool is included with the TypeScript package. No additional dependenci
 | `--azure-api-version` || Azure OpenAI API version (default: 2025-01-01-preview) |
 | `--models` || Models for benchmark mode (benchmark only) |
 | `--latency-iterations` || Latency test samples (default: 25) (benchmark only) |
+| `--max-parallel-models` || Maximum concurrent models in benchmark mode (default: CPU count) (benchmark only) |
+| `--benchmark-chunk-size` || Sample chunk size per model for memory-efficient benchmarking (benchmark only) |

 ## Configuration

@@ -154,6 +156,8 @@ npm run eval -- --config-path config.json --dataset-path data.jsonl --base-url h
 - **Multi-stage evaluation**: pre_flight, input, output stages
 - **Automatic stage detection**: Evaluates all stages found in configuration
 - **Batch processing**: Configurable parallel processing
+- **Parallel benchmarking**: Run multiple models concurrently with CPU-aware defaults
+- **Memory-efficient chunking**: Process large datasets in smaller chunks during benchmarking
 - **Benchmark mode**: Model performance comparison with ROC AUC, precision at recall thresholds
 - **Latency testing**: End-to-end guardrail performance measurement
 - **Visualization**: Automatic chart and graph generation
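
The two documented flags suggest a bounded-concurrency loop over models combined with chunked iteration over samples. The sketch below shows one way that could work. It is not taken from the evals tool: `chunkSamples`, `mapWithLimit`, `benchmarkModels`, and the `evaluateChunk` callback are hypothetical names. The caller passes `maxParallelModels` explicitly here, whereas the documented default for `--max-parallel-models` is the CPU count.

```typescript
// Hypothetical helpers for illustration; none of these names come from the evals tool.

/** Split samples into consecutive chunks of at most `size` items. */
function chunkSamples<T>(samples: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("chunk size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let start = 0; start < samples.length; start += size) {
    chunks.push(samples.slice(start, start + size));
  }
  return chunks;
}

/** Run `worker` over `items` with at most `limit` calls in flight; results keep input order. */
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results = new Array<R>(items.length);
  let next = 0;
  // Each runner pulls the next unclaimed index until the input is exhausted.
  const runner = async (): Promise<void> => {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  };
  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

/** Assumed shape of a benchmark loop: models in parallel, one chunk at a time per model. */
async function benchmarkModels<S>(
  models: readonly string[],
  samples: readonly S[],
  maxParallelModels: number,
  chunkSize: number,
  evaluateChunk: (model: string, chunk: S[]) => Promise<number>,
): Promise<Map<string, number>> {
  const chunks = chunkSamples(samples, chunkSize);
  const totals = await mapWithLimit(models, maxParallelModels, async (model) => {
    let processed = 0;
    for (const chunk of chunks) {
      // Only one chunk per model is being evaluated at any moment.
      processed += await evaluateChunk(model, chunk);
    }
    return processed;
  });
  return new Map(models.map((model, i): [string, number] => [model, totals[i]]));
}
```

In this sketch a smaller chunk size lowers how many samples each model evaluates at once, and a lower model limit reduces how many models compete for CPU and memory at the same time. That is one plausible reading of the trade-off the two flags expose.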
