Test AI agents
- Build a question bank with an answer rubric. Ask the model the same question 3 times.
- Have separate QA models evaluate the answers.
- Check for:
  - Average accuracy with respect to the rubric (QA model)
  - Similarity across the 3 answers (QA model)
  - Adherence to the format (QA model)
  - Consistency of facts / claims (QA model)
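
A rough sketch of the repeat-and-score loop described above, to make the idea concrete. Everything named here is hypothetical: `ask_agent` stands in for whatever calls the agent under test, and `score_against_rubric` stands in for a QA model that returns a score between 0 and 1. The similarity check uses `difflib.SequenceMatcher` as a cheap lexical placeholder, not a QA model, so it only illustrates where that check would plug in. Format adherence and fact consistency are left out and would follow the same pattern.

```python
from difflib import SequenceMatcher
from itertools import combinations
from statistics import mean
from typing import Callable


def evaluate_question(
    question: str,
    rubric: str,
    ask_agent: Callable[[str], str],  # hypothetical: calls the agent under test
    score_against_rubric: Callable[[str, str, str], float],  # hypothetical QA model, returns 0..1
    runs: int = 3,
) -> dict:
    """Ask the same question `runs` times and aggregate the scores."""
    if runs < 2:
        raise ValueError("need at least 2 runs to compare answers")

    answers = [ask_agent(question) for _ in range(runs)]

    # Average accuracy with respect to the rubric, as judged by the QA model.
    accuracy = mean(score_against_rubric(question, rubric, a) for a in answers)

    # Pairwise similarity across the answers (lexical placeholder for a QA model).
    similarity = mean(
        SequenceMatcher(None, a, b).ratio() for a, b in combinations(answers, 2)
    )

    return {"answers": answers, "accuracy": accuracy, "similarity": similarity}


# Usage with stub functions in place of a real agent and QA model.
def stub_agent(question: str) -> str:
    return "The capital of France is Paris."


def stub_scorer(question: str, rubric: str, answer: str) -> float:
    return 1.0 if "Paris" in answer else 0.0


result = evaluate_question(
    "What is the capital of France?",
    "Answer must name Paris.",
    stub_agent,
    stub_scorer,
)
```

Passing the agent and the scorer in as callables keeps the harness independent of any particular model client, so the same loop can be reused for every entry in the question bank.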
Other things to do:
- Set the temperature to a low value
- Use multiple QA models to reduce the bias of any single evaluator
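
One possible way to combine several QA models, sketched under assumptions: each judge is a hypothetical callable that takes the question, rubric, and answer and returns a score between 0 and 1. Averaging the scores and reporting their spread makes disagreement between judges visible. Setting the temperature is not shown, since how that is done depends on the model client in use.

```python
from statistics import mean
from typing import Callable, Sequence

# Hypothetical judge signature: (question, rubric, answer) -> score in 0..1
Judge = Callable[[str, str, str], float]


def panel_score(question: str, rubric: str, answer: str, judges: Sequence[Judge]) -> dict:
    """Score one answer with every judge and summarize the results."""
    if not judges:
        raise ValueError("at least one judge is required")

    scores = [judge(question, rubric, answer) for judge in judges]
    return {
        "scores": scores,
        "mean": mean(scores),
        # A large spread means the judges disagree and the answer needs a closer look.
        "spread": max(scores) - min(scores),
    }


# Usage with three stub judges that return fixed scores.
stub_judges = [
    lambda q, r, a: 0.5,
    lambda q, r, a: 1.0,
    lambda q, r, a: 0.75,
]
panel = panel_score("question", "rubric", "answer", stub_judges)
```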