-
Notifications
You must be signed in to change notification settings - Fork 5
Open
Description
I'm thinking about ways to enable traditional data capture workflows. Gold standards are a super common option. What if we created another utility rate_gold function? It takes in a dict input->correct answer as an argument, and then the utility function looks to see if any inputs from the tasks match the input, and if so check if the response matches a gold. It then takes sum(correct) and sends that to the rate() function.
Metadata
Metadata
Assignees
Labels
No labels