Supporting More Datasets to Eval Model Reasoning #118

@rootfs

Description

Is your feature request related to a problem? Please describe.
The current router reason bench uses MMLU-Pro for model evaluation. Because the classifier is trained on the same dataset, it would be more reasonable to evaluate the router's classification accuracy and reasoning settings on other datasets.

Describe the solution you'd like
Build a dataset factory and support datasets such as GPQA, BIG-bench, etc.
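
A rough sketch of what such a factory could look like, to make the request concrete. Everything here is hypothetical and not taken from the existing bench code: the `Question` schema, the `register_dataset` decorator, `load_dataset_by_name`, and the stub loaders are illustrative names only. The stub loaders yield placeholder rows, not real GPQA or BIG-bench items; real loaders would read the actual datasets and map them into the shared schema.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Question:
    """One benchmark item in a dataset-agnostic shape (hypothetical schema)."""
    text: str
    options: List[str]
    answer_index: int
    category: str


# Hypothetical registry mapping a lowercase dataset name to its loader.
_LOADERS: Dict[str, Callable[[], Iterable[Question]]] = {}


def register_dataset(name: str):
    """Decorator that registers a loader function under `name`."""
    def decorator(loader: Callable[[], Iterable[Question]]):
        key = name.lower()
        if key in _LOADERS:
            raise ValueError(f"dataset {name!r} is already registered")
        _LOADERS[key] = loader
        return loader
    return decorator


def available_datasets() -> List[str]:
    """Return the registered dataset names in sorted order."""
    return sorted(_LOADERS)


def load_dataset_by_name(name: str) -> List[Question]:
    """Look up a loader by name (case-insensitive) and materialize its items."""
    key = name.lower()
    if key not in _LOADERS:
        raise ValueError(
            f"unknown dataset {name!r}; available: {available_datasets()}"
        )
    return list(_LOADERS[key]())


@register_dataset("gpqa")
def _load_gpqa_stub() -> Iterable[Question]:
    # Placeholder row only; a real loader would read the actual GPQA data.
    yield Question("placeholder question", ["A", "B", "C", "D"], 0, "physics")


@register_dataset("big-bench")
def _load_big_bench_stub() -> Iterable[Question]:
    # Placeholder row only; a real loader would read the actual BIG-bench data.
    yield Question("placeholder question", ["yes", "no"], 1, "logic")
```

With this shape, the bench would call `load_dataset_by_name("gpqa")` (or take the name from a CLI flag) and run the same accuracy and reasoning-mode evaluation over the returned `Question` list, so adding a new dataset only means adding one registered loader.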
