Open source evaluation code?

Thank you for this detailed survey of LLM judges! 

My team is starting a project to build a unified framework for LLM judging, and we would like a way to automatically measure each method's agreement with human judgments and its bias. Would you consider open-sourcing the code you used to produce the results in Section 5 of your paper?

We would also need a permissive license such as MIT so that we can use the code without copyright issues. That would make things much easier for our project :)

Open source evaluation code? #3
