@silky1708
Member

  • I have outlined why this dataset is filling an existing gap in mteb
  • I have tested that the dataset runs with the mteb package.
  • I have run the following models on the task (adding the results to the PR). These can be run using the mteb run -m {model_name} -t {task_name} command.
    • sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
    • intfloat/multilingual-e5-small
  • I have checked that the performance is neither trivial (both models gain close to perfect scores) nor random (both models gain close to random scores).
  • I have considered the size of the dataset and reduced it if it is too big (2048 examples is typically large enough for most tasks).
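
As a rough illustration of the "run the following models" item, the snippet below fills in the command template quoted in the checklist for the two listed models. It only builds and prints the command strings; it does not invoke mteb. The task name is an assumption: it is taken from the class name in this PR's diff, and the name registered in the task metadata may differ.

```python
import shlex

# Command template quoted verbatim in the checklist.
TEMPLATE = "mteb run -m {model_name} -t {task_name}"

# The two models listed in the checklist.
MODELS = [
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    "intfloat/multilingual-e5-small",
]

# Assumption: the registered task name matches the class name in the diff.
TASK_NAME = "GoogleSVQA2TRetrieval"


def build_commands(models, task_name):
    """Return one shell-safe command string per model."""
    return [
        TEMPLATE.format(
            model_name=shlex.quote(model),
            task_name=shlex.quote(task_name),
        )
        for model in models
    ]


commands = build_commands(MODELS, TASK_NAME)
for command in commands:
    print(command)
```

Each printed line can be pasted into a shell where mteb is installed.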

If you add a model or a dataset, please add the corresponding checklist:

@silky1708 silky1708 self-assigned this Nov 6, 2025
@silky1708 silky1708 added the new dataset (Issues related to adding a new task or dataset) and maeb (Audio extension) labels Nov 6, 2025
@KennethEnevoldsen KennethEnevoldsen marked this pull request as draft November 6, 2025 08:15
}


class GoogleSVQA2TRetrieval(AbsTaskAny2AnyRetrieval):
Member

After #3528, you need to inherit from AbsTaskRetrieval instead of AbsTaskAny2AnyRetrieval.
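
A minimal sketch of the requested change, assuming only the base class needs to be swapped. The `AbsTaskRetrieval` defined here is a local stand-in so the snippet runs without mteb installed; in the PR it would be imported from mteb (the import path is not shown in this thread), and the task's metadata and loading code are omitted.

```python
# Hypothetical stand-in for mteb's AbsTaskRetrieval, defined locally so the
# sketch is self-contained. In the PR, import the real class from mteb.
class AbsTaskRetrieval:
    """Placeholder base class; not the real mteb implementation."""


# The class from the diff, now inheriting from AbsTaskRetrieval rather than
# AbsTaskAny2AnyRetrieval. Metadata and dataset loading are omitted.
class GoogleSVQA2TRetrieval(AbsTaskRetrieval):
    """Task class from this PR with the updated base class."""


print(GoogleSVQA2TRetrieval.__mro__[1].__name__)
```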
