### Contributing

#### Adding Evaluation Metrics

To add new evaluation metrics, modify the `evaluate_response()` function in `evaluation/evals.py`:

**Update dependencies** in the script header, and ensure the exception handling returns `None` values for the new metrics.
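As a minimal sketch of this pattern, a new metric might be wired into `evaluate_response()` like the following. The metric name `word_count` and the return structure are illustrative assumptions, not the module's actual shape:

```python
# Hypothetical sketch: adding one metric to evaluate_response().
# evaluate_response() lives in evaluation/evals.py; the metric and
# dict layout below are assumptions for illustration only.

def word_count_metric(response_text: str) -> int:
    """Example metric: number of whitespace-separated words."""
    return len(response_text.split())


def evaluate_response(response_text: str) -> dict:
    metrics: dict = {}
    try:
        metrics["word_count"] = word_count_metric(response_text)
    except Exception:
        # Per the guidance above, a failed metric falls back to None
        # instead of aborting the whole evaluation.
        metrics["word_count"] = None
    return metrics
```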

#### Adding New LLM Models

To add a new LLM model for evaluation, implement a handler in `server/api/services/llm_services.py`:

1. **Create a handler class** inheriting from `BaseModelHandler`.
2. **Register it in `ModelFactory`** by adding it to the `HANDLERS` dictionary.
3. **Use it in experiments** by referencing the handler key in your experiments CSV.

The evaluation system will automatically use your handler through the Factory Method pattern.
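The three steps above can be sketched as follows. `BaseModelHandler`, `ModelFactory`, and `HANDLERS` come from `server/api/services/llm_services.py`, but their exact signatures here are assumptions; the handler body is a stand-in for a real API call:

```python
# Hypothetical sketch of the Factory Method flow described above.
# Real base-class and factory signatures may differ from these.

class BaseModelHandler:
    def generate(self, prompt: str) -> str:
        raise NotImplementedError


class MyModelHandler(BaseModelHandler):
    """Step 1: a handler for the new model."""

    def generate(self, prompt: str) -> str:
        # A real handler would call the model's API here; echoing
        # keeps the sketch self-contained and runnable.
        return f"my-model response to: {prompt}"


# Step 2: register the handler under a key in the HANDLERS dictionary.
HANDLERS = {"my-model": MyModelHandler}


class ModelFactory:
    @staticmethod
    def create(model_key: str) -> BaseModelHandler:
        return HANDLERS[model_key]()


# Step 3: experiments reference the key (e.g. "my-model") in the CSV;
# the evaluation system resolves it through the factory.
handler = ModelFactory.create("my-model")
```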
#### Running Tests

The evaluation module includes comprehensive tests for all core functions. Run the test suite using:

```sh
uv run test_evals.py
```

The tests cover:
- **Cost calculation** with various token usage and pricing scenarios
- **CSV loading** with validation and error handling
- **Response evaluation** including async operations and exception handling
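To illustrate the first bullet, a cost-calculation test might look like the sketch below. The function name `calculate_cost` and its per-1K-token pricing arguments are assumptions for illustration, not the module's actual API:

```python
# Hypothetical example of the cost-calculation checks in test_evals.py.
# calculate_cost() and its parameters are illustrative assumptions.

def calculate_cost(prompt_tokens: int, completion_tokens: int,
                   price_in: float, price_out: float) -> float:
    """Dollar cost given per-1K-token input/output prices."""
    return (prompt_tokens / 1000) * price_in + (completion_tokens / 1000) * price_out


def test_calculate_cost():
    # 1K input at $0.50/1K plus 1K output at $1.50/1K.
    assert calculate_cost(1000, 1000, 0.5, 1.5) == 2.0
    # Zero usage should cost nothing.
    assert calculate_cost(0, 0, 0.5, 1.5) == 0.0


test_calculate_cost()
```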