
Releases: embeddings-benchmark/mteb

2.11.6

23 Mar 21:33


2.11.6 (2026-03-23)

Fix

  • fix: add contributed_by field to TaskMetadata (#4275)

  • feat: add contributed_by field to TaskMetadata (#3920)

Add optional contributed_by field to specify who provided datasets,
especially useful for private datasets where source info is harder
to find. The field is surfaced in dataset card descriptions and the
leaderboard task info table.

  • fix: address PR review feedback for contributed_by field
  • Remove full_description property per Kenneth's feedback
  • Remove contributed_by from leaderboard Task Info tab (add to dataset card instead)
  • Add contributed_by to dataset card template
  • Add contributed_by to docs generation in create_available_tasks.py
  • Annotate RTEB private tasks with contributed_by="Voyage AI"
  • Annotate ViDoRe3 tasks with contributed_by="Illuin Technology" (3f4c00b)

2.11.5

23 Mar 15:45


2.11.5 (2026-03-23)

Fix

  • fix: ensure that display_on_leaderboard actually reflects whether the benchmark is displayed (#4288)

  • fix: ensure that display_on_leaderboard actually reflects whether the benchmark is displayed

I believe the previous attribute was a leftover from an earlier version of the leaderboard

  • fix typing (d4daab0)

  • fix: Add modality filtering to get_model_metas (#4262)

  • Add modality filtering to get_model_metas

  • Add tests for modality filtering and update typing

  • Add tests for modality filtering

  • Fix lint formatting (newline + spacing)

  • Fix modality tests to use string values

  • Fix lint (final newline)

  • Fix lint + finalize modality filtering

  • Add test for unfiltered models and fix lint

  • Fix final newline

  • lint


Co-authored-by: David Schechter <davidschechter@davids-air-2.lan>
Co-authored-by: David Schechter <davidschechter@Davids-MacBook-Air-2.local>
Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com> (3813b2d)
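The filtering added here can be sketched as a subset check over each model's declared modalities. ModelMeta below is an illustrative stand-in, and the filter semantics (all requested modalities must be supported; None disables filtering) are an assumption based on the notes:

```python
from dataclasses import dataclass, field


@dataclass
class ModelMeta:  # illustrative stand-in for mteb's real ModelMeta
    name: str
    modalities: list[str] = field(default_factory=lambda: ["text"])


def get_model_metas(metas, modalities=None):
    """Keep only metas supporting every requested modality; None disables the filter."""
    if modalities is None:
        return list(metas)
    wanted = set(modalities)
    return [m for m in metas if wanted <= set(m.modalities)]


metas = [ModelMeta("text-only"), ModelMeta("clip-like", ["text", "image"])]
print([m.name for m in get_model_metas(metas, modalities=["image"])])  # ['clip-like']
```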

  • fix: fix tasktype aggregation (#4283)

  • fix: remove broken logo

  • fix: vidore leaderboard

  • removed non-displayed logo
  • fixed issues with display of vidore benchmark

2.11.4

23 Mar 13:17


2.11.4 (2026-03-23)

Fix

  • fix: Revert gradio version bump (#4287)

After a lot of local testing this seems to resolve the issue (038c4c7)

  • fix: optionally pass embed_dim (#4285)

optionally pass embed_dim (84502b2)

2.11.3

23 Mar 12:20


2.11.3 (2026-03-23)

Fix

  • fix: don't fetch from hub when calling get_model_meta, but do when calling get_model (#4284)

  • Fix behaviour when getting metadata for non-existing models

  • Fix naming of cross-encoder/ms-marco-TinyBERT-L2-v2 (1c90fcb)
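The intended split — metadata lookup stays local, while loading a model may hit the hub — can be sketched as below. The registry contents and error behaviour are assumptions for illustration, not mteb's actual code:

```python
# Hypothetical local registry; real mteb keeps model metadata in-package.
_LOCAL_REGISTRY = {
    "cross-encoder/ms-marco-TinyBERT-L2-v2": {"max_tokens": 512},  # value illustrative
}


def get_model_meta(name: str) -> dict:
    """Local-only lookup: unknown models raise instead of triggering a hub fetch."""
    try:
        return _LOCAL_REGISTRY[name]
    except KeyError:
        raise KeyError(f"No local metadata registered for {name!r}") from None


def get_model(name: str) -> dict:
    """Loading a model is allowed to fetch from the hub (stubbed out here)."""
    meta = _LOCAL_REGISTRY.get(name)
    if meta is None:
        meta = {"fetched_from_hub": True}  # real code would download from the hub here
    return {"name": name, **meta}
```

The point of the fix is that a plain metadata query never incurs network traffic; only get_model does.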

Unknown

  • model: Add potion-base-32M and potion-retrieval-32M and update Model2Vec citation (#4286)

Added potion-base-32m and potion-retrieval-32m (f3e4dfb)

  • Fix display of vidore benchmarks on the leaderboard (#4282)

  • fix: remove broken logo

  • fix: vidore leaderboard

  • removed non-displayed logo
  • fixed issues with display of vidore benchmark (3149207)

2.11.2

22 Mar 14:47


2.11.2 (2026-03-22)

Fix

  • fix: make sure that the leaderboard builds as intended (#4269)

  • fix: make sure that the leaderboard builds as intended

Includes the fix from #4268, along with a few additional minor fixes. Notably, gradio was bumped to allow for pandas v3, and the makefile was updated to ensure that the leaderboard is run with the correct set of dependencies.

  • Fix Gradio style dataframe initialization error on LB (#4268)

Co-authored-by: Munot Ayush Sunil <munotayush6@kgpian.iitkgp.ac.in> (fd49226)

2.11.1

21 Mar 20:41


2.11.1 (2026-03-21)

Fix

  • fix: InstructSentenceTransformers embed dim (#4264)

  • fix: InstructSentenceTransformers embed dim

introduced in #4170

  • add to docstring

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com> (b517688)

  • fix: Remove Memory Usage column and Add active Parameters column (#3979)

  • Update Leaderboard - Remove Memory Usage column and Add active Parameters column

  • Rename columns for better visibility

  • fix active parameter formatting on LB

  • Add exact embedding parameter value for KALM Model (109ef0b)

2.11.0

21 Mar 19:20


2.11.0 (2026-03-21)

Feature

  • feat: Add Matryoshka support when loading a model (#4170)

  • add Matryoshka support

  • raise error

  • upd embed_dim in leaderboard

  • fix tests

  • fix typecheck

  • add tests

  • upd check

  • add docs (44e5947)
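Matryoshka loading amounts to truncating embeddings to the requested embed_dim and renormalizing, and raising when the requested dimension exceeds the model's native one. A sketch of that operation in pure Python — illustrative, not mteb's implementation:

```python
import math


def truncate_embedding(vec: list[float], embed_dim: int) -> list[float]:
    """Matryoshka-style truncation: keep the leading embed_dim values,
    then L2-renormalize so cosine similarities stay meaningful."""
    if embed_dim > len(vec):
        raise ValueError(
            f"embed_dim={embed_dim} exceeds the model's native dimension {len(vec)}"
        )
    head = vec[:embed_dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]


print(truncate_embedding([3.0, 4.0, 0.0, 1.0], 2))  # [0.6, 0.8]
```

This only makes sense for models trained with a Matryoshka objective, which is presumably why an error is raised rather than silently truncating arbitrary models.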

Unknown

Add a Thai language benchmark with 28 tasks spanning 6 task types:

  • BitextMining (6): BibleNLP, Flores, NTREX, Tatoeba, WebFAQ
  • Classification (9): Wisesight, Wongnai, SIB200, MASSIVE, MTOP, etc.
  • Clustering (1): SIB200ClusteringS2S
  • PairClassification (1): XNLI
  • Reranking (2): MIRACL, MultiLongDoc
  • Retrieval (9): MIRACL, BelebeleRetrieval, MKQARetrieval, etc.

Results for 13 models already merged: embeddings-benchmark/results#428

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

  • Update mteb/benchmarks/benchmarks/benchmarks.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • Update benchmarks.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • Curate Thai benchmark: remove cross-lingual and low-quality tasks

Address review feedback from @KennethEnevoldsen:

Removed (12 tasks):

  • All 6 bitext mining tasks — cross-lingual by design, not monolingual Thai
  • LanguageClassification — trivial LID, off-topic for monolingual benchmark
  • MassiveIntentClassification, MassiveScenarioClassification — machine-translated
  • MultilingualSentimentClassification — unclear Thai provenance
  • WongnaiReviewsClassification — redundant sentiment (keep Wisesight as best)

Kept (15 tasks):

  • Classification: MTOP (purpose-built for Thai), SIB200, Wisesight (native Thai)
  • Clustering: SIB200ClusteringS2S
  • PairClassification: XNLI (human-translated)
  • Reranking: MIRACL (human-judged), MultiLongDoc
  • Retrieval: MIRACL, Belebele, MKQA, MrTidy, MultiLongDoc, WebFAQ, XQuAD

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

  • Add contacts field for benchmark maintainer

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>


Co-authored-by: anusoft <anu@anusoft.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> (03dcfc8)

  • model: Add NanoVDR-S-Multi with custom AbsEncoder for asymmetric VDR (#4242)

  • Add model meta for nanovdr/NanoVDR-S-Multi

  • add n_embedding_parameters

  • Implement NanoVDRWrapper as custom AbsEncoder with asymmetric routing

Query encoding uses the lightweight NanoVDR-S-Multi student (69M, text-only).
Document encoding uses the frozen Qwen3-VL-Embedding-2B teacher (2B, VLM).
Teacher is lazy-loaded only when document encoding is needed.

  • Fix ruff lint and formatting

  • Default to student encoder for non-retrieval tasks

  • Add error for unsupported image-query tasks

  • Remove trust_remote_code=True (model class built locally by MTEB)

  • Apply suggestion from @Samoed

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>


Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>
Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> (2e1e513)
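The asymmetric routing described above — the light student encodes queries, the heavy teacher encodes documents, and the teacher is only loaded on the first document batch — can be sketched generically. The class and loader callables below are hypothetical, not the actual NanoVDRWrapper:

```python
class AsymmetricVDREncoder:
    """Hypothetical sketch: student handles queries, teacher handles documents,
    and the large teacher model is lazy-loaded only when documents arrive."""

    def __init__(self, load_student, load_teacher):
        self.student = load_student()       # small text-only model, loaded eagerly
        self._load_teacher = load_teacher   # large VLM loader, deferred
        self.teacher = None

    def encode_queries(self, queries):
        return [self.student(q) for q in queries]

    def encode_documents(self, documents):
        if self.teacher is None:            # lazy-load on first document batch
            self.teacher = self._load_teacher()
        return [self.teacher(d) for d in documents]
```

Query-only workloads therefore never pay the cost of loading the 2B teacher.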

2.10.14

19 Mar 20:33


2.10.14 (2026-03-19)

Documentation

  • docs: fix formatting in the documentation (#4258) (7ff0dea)

Fix

  • fix: Added Embedding Parameter Value for Image and Audio Models (#4167)

  • Added Embedding Parameter Value for Image and Audio Models

  • Remove models from _MISSING_N_EMBEDDING_MODELS

  • Fix tests

  • Remove comment from CLAP models

  • Added embedding parameter value=0 for only image/audio models

  • update tests and modalities (14f33b5)

Unknown

  • Fix ColQwen3.5-4.5B - ColPaliEngineWrapper.encode() for multimodal datasets (#4245)

  • fix: improve input encoding logic for ColPali models to handle text and image features correctly

  • fix: refactor encoding logic in ColPali and ColQwen models to improve handling of text and image features

  • fix: refactor ColQwen3.5 wrapper to enhance input handling and support for image-text embeddings

  • fix: linting

  • Update mteb/models/model_implementations/colqwen_models.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • fix: image processing logic in ColQwen3.5 wrapper to match how

  • Update mteb/models/model_implementations/colqwen_models.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • fix: use process_queries instead of process_texts for text encoding

process_queries adds the query prefix and augmentation tokens that the
model was trained with. process_texts is plain tokenization without these,
leading to ~0.02-0.03 lower nDCG scores.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>


Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> (8d80cdd)
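The score gap follows from what each path feeds the model. A toy sketch with a made-up prefix and augmentation token — this is not the colpali_engine API, just an illustration of why the two paths produce different inputs:

```python
QUERY_PREFIX = "Query: "  # illustrative; the real prefix comes from training
AUG_TOKEN = "<pad>"       # illustrative augmentation token
N_AUG = 10


def process_queries(queries):
    """Query path: adds the training-time prefix and augmentation tokens."""
    return [QUERY_PREFIX + q + AUG_TOKEN * N_AUG for q in queries]


def process_texts(texts):
    """Plain path: no prefix, no augmentation tokens."""
    return list(texts)
```

Encoding queries through the plain path gives the model inputs it never saw during training, hence the small but consistent nDCG drop.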

2.10.13

18 Mar 17:05


2.10.13 (2026-03-18)

Fix

  • fix: remove unused calculate_probs method from model wrappers (#4251)

fix: remove unused calculate_probs method from model wrappers

Closes #4201 (c28635b)

Unknown

  • detect frameworks from HuggingFace tags instead of hardcoding PyTorch (#4252)

feat: detect frameworks from HuggingFace tags instead of hardcoding PyTorch

Expand _get_frameworks_from_hf_tags to detect pytorch, tf, jax, and
openvino from HuggingFace model tags instead of assuming PyTorch by
default. Add JAX and OpenVINO to the FRAMEWORKS literal type.

Closes #4104 (1ea8628)
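A sketch of the tag-to-framework detection; the exact tag strings and the mapping below are assumptions based on the notes above, and the helper's real signature may differ:

```python
_TAG_TO_FRAMEWORK = {  # tag names assumed from the release notes
    "pytorch": "PyTorch",
    "tf": "TensorFlow",
    "jax": "JAX",
    "openvino": "OpenVINO",
}


def frameworks_from_hf_tags(tags: list[str]) -> list[str]:
    """Detect frameworks from HuggingFace model tags. Note there is no
    PyTorch fallback when no framework tag is present."""
    tag_set = set(tags)
    return [fw for tag, fw in _TAG_TO_FRAMEWORK.items() if tag in tag_set]


print(frameworks_from_hf_tags(["transformers", "tf", "jax"]))  # ['TensorFlow', 'JAX']
```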

  • fix colqwen3.5 (#4243) (c6b9389)

  • Model: Add athrael-soju/ColQwen3.5-v3 (#4241)

  • feat: add ColQwen3.5 model wrapper and metadata

  • Update mteb/models/model_implementations/colqwen_models.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • feat: update max_tokens for ColQwen3.5 v3 model metadata

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> (f0f10c2)

  • dataset: Add MMDocIR (#4230)

  • dataset: Add MMDocIR

  • change dataset path and revision (4f1688f)

2.10.12

14 Mar 14:35


2.10.12 (2026-03-14)

Documentation

  • docs: remove mieb and mmteb contribution docs (#4227)

I don't think we maintain these anymore. I think they are fine to remove (823236c)

  • docs: fix docs paths (#4224)

fix docs paths (0d07d33)

  • docs: fix naming on contributing docs (6787a17)

Fix

  • fix: metadata getting computed for existing MTEB model (#4231)

  • Fix behaviour while getting metadata of existing MTEB model

  • Added basic metadata in overwrite

  • Updated CrossEncoderWrapper with same changes (973a5a1)

Unknown

  • Model: Add new model revision of Querit/Querit (#4215)

New model revision (f913ed8)

  • Fix zeroentropy/zembed-1 metadata (#4233)

Fix zeroentropy/zembed-1 metadata (revision, release_date, max_tokens)

The metadata added in #4202 had incorrect values for three fields:

  • revision: pointed to wrong HuggingFace commit
  • release_date: was "2025-09-16", should be "2026-03-02"
  • max_tokens: was 40960, should be 32768 (3cd67fd)
  • Add Zeroentropy models (#4228)

  • Add Zeroentropy models

  • correct metadata

  • Correct loader_kwargs for rerankers (791a185)

  • model: nvidia/llama-nemotron-embed-vl-1b-v2 for ViDoRe (#4192)

  • Adds nvidia/llama-nemotron-embed-vl-1b-v2 model

  • Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • Fixing tests and linting issues

  • Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • Nemotron Embed VL 1B: Setting the number of tiles an image can be split into

  • Fixing lint issue

  • Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py

  • Disabling image modality by default

  • Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • Setting default modality of Nemotron Embed VL 1B as image + text (when available)

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> (2a8c2d3)

  • dataset: Add IRPAPERS (#4225)

  • dataset: Add IRPAPERS

  • change dataset path

  • dataset transform

  • remove samples without text

  • add t2i category

  • delete former stats and remove column (abd5048)

  • Model: Add F2LLM-v2 (#4222)

  • Add f2llm-v2

  • lint codefuse models

  • Fix error in prompt (42c0d51)