Releases: embeddings-benchmark/mteb
2.11.6 (2026-03-23)
Fix
- fix: add contributed_by field to TaskMetadata (#4275)
- feat: add contributed_by field to TaskMetadata (#3920)
  Add an optional contributed_by field to specify who provided a dataset,
  especially useful for private datasets where source information is harder
  to find. The field is surfaced in dataset card descriptions and the
  leaderboard task info table.
  - fix: address PR review feedback for the contributed_by field
  - Remove the full_description property per Kenneth's feedback
  - Remove contributed_by from the leaderboard Task Info tab (add it to the dataset card instead)
  - Add contributed_by to the dataset card template
  - Add contributed_by to docs generation in create_available_tasks.py
  - Annotate RTEB private tasks with contributed_by="Voyage AI"
  - Annotate ViDoRe3 tasks with contributed_by="Illuin Technology" (3f4c00b)
2.11.5 (2026-03-23)
Fix
- fix: ensure that display_on_leaderboard actually reflects whether the benchmark is displayed (#4288)
  - fix: ensure that display_on_leaderboard actually reflects whether the benchmark is displayed
    I believe the previous attribute was a leftover from an earlier version of the leaderboard.
  - fix typing (d4daab0)
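The intended behavior is simply that the display flag gates which benchmarks reach the leaderboard. A toy sketch of that contract (the `Benchmark` class and registry here are illustrative stand-ins; only the `display_on_leaderboard` attribute comes from the release notes):

```python
from dataclasses import dataclass


# Stand-in benchmark record; only display_on_leaderboard is from the notes.
@dataclass
class Benchmark:
    name: str
    display_on_leaderboard: bool = True


_BENCHMARKS = [
    Benchmark("MTEB(eng, v2)"),
    Benchmark("Internal-Dev-Benchmark", display_on_leaderboard=False),
]


def leaderboard_benchmarks() -> list[str]:
    """Only benchmarks flagged for display reach the leaderboard."""
    return [b.name for b in _BENCHMARKS if b.display_on_leaderboard]


print(leaderboard_benchmarks())  # ['MTEB(eng, v2)']
```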
- fix: Add modality filtering to get_model_metas (#4262)
  - Add modality filtering to get_model_metas
  - Add tests for modality filtering and update typing
  - Add tests for modality filtering
  - Fix lint formatting (newline + spacing)
  - Fix modality tests to use string values
  - Fix lint (final newline)
  - Fix lint + finalize modality filtering
  - Add test for unfiltered models and fix lint
  - Fix final newline
  - lint
  Co-authored-by: David Schechter <davidschechter@davids-air-2.lan>
  Co-authored-by: David Schechter <davidschechter@Davids-MacBook-Air-2.local>
  Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com> (3813b2d)
- fix: fix tasktype aggregation (#4283)
- fix: remove broken logo
- fix: vidore leaderboard
  - removed the non-displayed logo
  - fixed issues with the display of the vidore benchmark
2.11.4
2.11.3 (2026-03-23)
Fix
- fix: don't fetch from hub when calling get_model_meta, but do when calling get_model (#4284)
  - Fix behaviour when getting metadata for non-existing models
  - Fix naming of cross-encoder/ms-marco-TinyBERT-L2-v2 (1c90fcb)
Unknown
- model: Add potion-base-32M and potion-retrieval-32M and update Model2Vec citation (#4286)
  Added potion-base-32M and potion-retrieval-32M (f3e4dfb)
- Fix display of vidore benchmarks on the leaderboard (#4282)
  - fix: remove broken logo
  - fix: vidore leaderboard
    - removed the non-displayed logo
    - fixed issues with the display of the vidore benchmark (3149207)
2.11.2 (2026-03-22)
Fix
- fix: make sure that the leaderboard builds as intended (#4269)
  - fix: make sure that the leaderboard builds as intended
    Includes the fix from #4268, along with a few additional minor fixes. Notably, gradio was updated to allow for pandas v3, and the makefile was updated to ensure that the leaderboard runs with the correct set of dependencies.
  - Fix Gradio style dataframe initialization error on LB (#4268)
  Co-authored-by: Munot Ayush Sunil <munotayush6@kgpian.iitkgp.ac.in> (fd49226)
2.11.1 (2026-03-21)
Fix
- fix: InstructSentenceTransformers embed dim (#4264)
  - fix: InstructSentenceTransformers embed dim
    (introduced in #4170)
  - add to docstring
  Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com> (b517688)
2.11.0 (2026-03-21)
Feature
- feat: Add Matryoshka support when loading a model (#4170)
  - add Matryoshka support
  - raise error
  - upd embed_dim in leaderboard
  - fix tests
  - fix typecheck
  - add tests
  - upd check
  - add docs (44e5947)
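Matryoshka-style loading truncates each embedding to a leading slice of dimensions and re-normalizes it, and should raise when the requested dimension exceeds the model's native one (the "raise error" commit above). A sketch of that behavior, not MTEB's actual loader code (the function name is made up):

```python
import math


def truncate_embedding(vec: list[float], dim: int) -> list[float]:
    """Keep the first `dim` Matryoshka dimensions and L2-renormalize."""
    if dim > len(vec):
        raise ValueError(f"Requested dim {dim} > native dim {len(vec)}")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head


emb = truncate_embedding([3.0, 4.0, 1.0, 2.0], dim=2)
print(emb)  # [0.6, 0.8]
```

Re-normalizing matters because cosine-similarity scores assume unit-length vectors after truncation.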
Unknown
- Merge branch 'main' of https://github.com/embeddings-benchmark/mteb (9a2dbd8)
- benchmark: Add Thai benchmark MTEB(tha, v1) (#4213)
  - Add Thai benchmark: MTEB(tha, v1)
Add a Thai language benchmark with 28 tasks spanning 6 task types:
- BitextMining (6): BibleNLP, Flores, NTREX, Tatoeba, WebFAQ
- Classification (9): Wisesight, Wongnai, SIB200, MASSIVE, MTOP, etc.
- Clustering (1): SIB200ClusteringS2S
- PairClassification (1): XNLI
- Reranking (2): MIRACL, MultiLongDoc
- Retrieval (9): MIRACL, BelebeleRetrieval, MKQARetrieval, etc.
Results for 13 models already merged: embeddings-benchmark/results#428
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Update mteb/benchmarks/benchmarks/benchmarks.py
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
- Update benchmarks.py
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
- Curate Thai benchmark: remove cross-lingual and low-quality tasks
Address review feedback from @KennethEnevoldsen:
Removed (12 tasks):
- All 6 bitext mining tasks — cross-lingual by design, not monolingual Thai
- LanguageClassification — trivial LID, off-topic for monolingual benchmark
- MassiveIntentClassification, MassiveScenarioClassification — machine-translated
- MultilingualSentimentClassification — unclear Thai provenance
- WongnaiReviewsClassification — redundant sentiment (keep Wisesight as best)
Kept (15 tasks):
- Classification: MTOP (purpose-built for Thai), SIB200, Wisesight (native Thai)
- Clustering: SIB200ClusteringS2S
- PairClassification: XNLI (human-translated)
- Reranking: MIRACL (human-judged), MultiLongDoc
- Retrieval: MIRACL, Belebele, MKQA, MrTidy, MultiLongDoc, WebFAQ, XQuAD
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add contacts field for benchmark maintainer
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: anusoft <anu@anusoft.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> (03dcfc8)
- model: Add NanoVDR-S-Multi with custom AbsEncoder for asymmetric VDR (#4242)
  - Add model meta for nanovdr/NanoVDR-S-Multi
  - add n_embedding_parameters
  - Implement NanoVDRWrapper as custom AbsEncoder with asymmetric routing
    Query encoding uses the lightweight NanoVDR-S-Multi student (69M, text-only).
    Document encoding uses the frozen Qwen3-VL-Embedding-2B teacher (2B, VLM).
    The teacher is lazy-loaded only when document encoding is needed.
  - Fix ruff lint and formatting
  - Default to student encoder for non-retrieval tasks
  - Add error for unsupported image-query tasks
  - Remove trust_remote_code=True (model class built locally by MTEB)
  - Apply suggestion from @Samoed
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>
Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> (2e1e513)
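The asymmetric routing described above (light student for queries, heavy teacher lazy-loaded for documents) can be sketched generically. This is an illustration of the pattern, not the NanoVDRWrapper code; the class name and loader callables are hypothetical:

```python
class AsymmetricEncoder:
    """Route queries to a cheap encoder and documents to an expensive
    one that is constructed only on first use (lazy loading)."""

    def __init__(self, load_student, load_teacher):
        self._student = load_student()      # small, always loaded
        self._load_teacher = load_teacher   # big, deferred
        self._teacher = None

    def encode_queries(self, texts):
        return [self._student(t) for t in texts]

    def encode_documents(self, docs):
        if self._teacher is None:           # first document batch
            self._teacher = self._load_teacher()
        return [self._teacher(d) for d in docs]


enc = AsymmetricEncoder(
    load_student=lambda: (lambda t: f"student({t})"),
    load_teacher=lambda: (lambda d: f"teacher({d})"),
)
print(enc.encode_queries(["q1"]))    # ['student(q1)']
print(enc._teacher is None)          # True: teacher not loaded yet
print(enc.encode_documents(["d1"]))  # ['teacher(d1)']
```

Query-only workloads never pay the teacher's load cost, which is the point of the lazy-loading commit above.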
2.10.14 (2026-03-19)
Fix
- fix: Added Embedding Parameter Value for Image and Audio Models (#4167)
  - Added embedding parameter value for image and audio models
  - Remove models from _MISSING_N_EMBEDDING_MODELS
  - Fix tests
  - Remove comment from CLAP models
  - Added embedding parameter value=0 only for image/audio models
  - update tests and modalities (14f33b5)
Unknown
- Fix ColQwen3.5-4.5B - ColPaliEngineWrapper.encode() for multimodal datasets (#4245)
  - fix: improve input encoding logic for ColPali models to handle text and image features correctly
  - fix: refactor encoding logic in ColPali and ColQwen models to improve handling of text and image features
  - fix: refactor ColQwen3.5 wrapper to enhance input handling and support for image-text embeddings
  - fix: linting
  - Update mteb/models/model_implementations/colqwen_models.py
    Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
  - fix: image processing logic in ColQwen3.5 wrapper to match how
  - Update mteb/models/model_implementations/colqwen_models.py
    Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
- fix: use process_queries instead of process_texts for text encoding
process_queries adds the query prefix and augmentation tokens that the
model was trained with. process_texts is plain tokenization without these,
leading to ~0.02-0.03 lower nDCG scores.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> (8d80cdd)
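The last fix above hinges on two distinct preprocessing paths: process_queries prepends the query prefix and augmentation tokens the model was trained with, while process_texts tokenizes plainly. A toy illustration of the two paths; the prefix and token strings below are made up, not ColPali's actual ones:

```python
# Illustrative only: the actual prefix and augmentation tokens used by
# ColPali-style models differ; this just shows the two code paths.
QUERY_PREFIX = "Query: "
AUGMENT_TOKENS = "<pad>" * 2


def process_texts(texts: list[str]) -> list[str]:
    """Plain path: no prefix, no augmentation tokens."""
    return list(texts)


def process_queries(queries: list[str]) -> list[str]:
    """Query path: prepend the training-time prefix and append
    augmentation tokens, matching how the model was trained."""
    return [QUERY_PREFIX + q + AUGMENT_TOKENS for q in queries]


print(process_queries(["what is mteb"])[0])
# Query: what is mteb<pad><pad>
```

Feeding queries through the plain path silently drops the training-time formatting, which is consistent with the small nDCG regression noted above.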
2.10.13 (2026-03-18)
Fix
- fix: remove unused calculate_probs method from model wrappers (#4251)
Unknown
- feat: detect frameworks from HuggingFace tags instead of hardcoding PyTorch (#4252)
  Expand _get_frameworks_from_hf_tags to detect pytorch, tf, jax, and
  openvino from HuggingFace model tags instead of assuming PyTorch by
  default. Add JAX and OpenVINO to the FRAMEWORKS literal type.
- Model: Add athrael-soju/ColQwen3.5-v3 (#4241)
  - feat: add ColQwen3.5 model wrapper and metadata
  - Update mteb/models/model_implementations/colqwen_models.py
    Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
- feat: update max_tokens for ColQwen3.5 v3 model metadata
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> (f0f10c2)
2.10.12 (2026-03-14)
Documentation
- docs: remove mieb and mmteb contribution docs (#4227)
I don't think we maintain these anymore. I think they are fine to remove (823236c)
- docs: fix docs paths (#4224)
fix docs paths (0d07d33)
- docs: fix naming on contributing docs (6787a17)
Fix
- fix: metadata getting computed for existing MTEB model (#4231)
  - Fix behaviour while getting metadata of an existing MTEB model
  - Added basic metadata in overwrite
  - Updated CrossEncoderWrapper with the same changes (973a5a1)
Unknown
- Model: Add new model revision of Querit/Querit (#4215)
New model revision (f913ed8)
- Fix zeroentropy/zembed-1 metadata (#4233)
Fix zeroentropy/zembed-1 metadata (revision, release_date, max_tokens)
The metadata added in #4202 had incorrect values for three fields:
- revision: pointed to wrong HuggingFace commit
- release_date: was "2025-09-16", should be "2026-03-02"
- max_tokens: was 40960, should be 32768 (3cd67fd)
- Add Zeroentropy models (#4228)
  - Add Zeroentropy models
  - correct metadata
  - Correct loader_kwargs for rerankers (791a185)
- model: nvidia/llama-nemotron-embed-vl-1b-v2 for ViDoRe (#4192)
  - Adds nvidia/llama-nemotron-embed-vl-1b-v2 model
  - Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py
    Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
  - Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py
    Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
  - Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py
    Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
  - Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py
    Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
  - Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py
    Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
  - Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py
    Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
  - Fixing tests and linting issues
  - Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py
    Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
  - Nemotron Embed VL 1B: Setting the number of tiles an image can be split
  - Fixing lint issue
  - Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py
  - Disabling image modality by default
  - Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py
    Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
  - Setting default modality of Nemotron Embed VL 1B as image + text (when available)
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> (2a8c2d3)