[EVAL] MultiChallenge #1019

## Evaluation short description
Aside from Multi-IF, very few evals target multi-turn instruction following. MultiChallenge is a difficult multi-turn eval that OpenAI and others report in their model cards.

## Evaluation metadata
Provide all available
- Paper url: https://arxiv.org/abs/2501.17399
- Github url: https://github.com/ekwinox117/multi-challenge
- Dataset url: https://github.com/ekwinox117/multi-challenge/tree/main/data

