fix: Error in siglip output conversion by KennethEnevoldsen · Pull Request #4205 · embeddings-benchmark/mteb

KennethEnevoldsen · 2026-03-06T09:45:24Z

Discovered when running for #4193

mteb/models/model_implementations/siglip_models.py

Samoed

Not sure why my comments are not displayed in here, only in file changes.

It seems that SigLIP uses last-token pooling, not mean pooling: https://github.com/huggingface/transformers/blob/aad13b87ed59f2afcfaebc985f403301887a35fc/src/transformers/models/siglip/modeling_siglip.py#L523-L525

>>> torch.mean(text_outputs["last_hidden_state"], dim=1)
tensor([[-0.3865, -0.5087,  0.1831,  ...,  0.8009,  0.1759, -0.5749],
        [-0.3301, -0.8056,  0.2536,  ...,  0.7427, -0.2328, -0.5802],
        [-0.2553, -1.1079,  0.4744,  ...,  0.0874, -0.0792, -0.6855],
        ...,
        [-0.4886, -0.6445,  0.0385,  ...,  0.3665, -0.1208, -0.5695],
        [ 0.0374, -0.6736,  0.1148,  ...,  0.1854,  0.2513, -0.7220],
        [-0.0087,  0.3741, -0.0740,  ...,  0.1965, -0.2676, -0.6060]])
>>> text_outputs.pooler_output
tensor([[ 1.0248,  0.2559,  0.1031,  ...,  0.1529, -0.5528,  0.2910],
        [ 0.7734, -0.6904, -0.7753,  ..., -0.5422,  0.6960, -0.1439],
        [ 0.4678, -0.2966, -0.2353,  ...,  0.1854, -0.6198,  0.6368],
        ...,
        [ 0.0607,  0.5256, -0.2899,  ..., -0.9362,  1.1828, -0.0575],
        [ 0.0698, -0.8170,  0.1015,  ..., -0.1641,  0.2055,  0.2031],
        [ 0.5874, -0.5110, -0.3075,  ...,  0.4028,  0.3083, -0.3046]])
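
To make the difference between the two pooling strategies concrete, here is a minimal, dependency-free sketch. Nested lists stand in for a tensor of shape (batch, seq_len, hidden), and the function names are hypothetical, not part of mteb or transformers. The real `pooler_output` may also pass through a projection head after the last token is selected, which this sketch omits.

```python
# Dependency-free illustration: nested lists stand in for a tensor of shape
# (batch, seq_len, hidden). All names here are hypothetical.


def mean_pool(hidden):
    """Average every token vector in each sequence (like torch.mean(x, dim=1))."""
    pooled = []
    for seq in hidden:
        dim = len(seq[0])
        pooled.append([sum(tok[d] for tok in seq) / len(seq) for d in range(dim)])
    return pooled


def last_token_pool(hidden):
    """Keep only the final token vector of each sequence (like x[:, -1, :])."""
    return [seq[-1] for seq in hidden]


# One sequence of three tokens, hidden size two.
hidden = [[[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]]

mean_result = mean_pool(hidden)        # averages over the three tokens
last_result = last_token_pool(hidden)  # keeps only the third token
```

The two results differ whenever the last token is not equal to the sequence average, which is consistent with the two tensors printed above having unrelated values.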

mteb/models/model_implementations/siglip_models.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

KennethEnevoldsen · 2026-03-06T12:24:07Z

Good catch - let me run some comparisons to check

KennethEnevoldsen · 2026-03-06T13:02:04Z

Yeah so clearly better, but still bad:

Vidore3FinanceEnRetrieval.v2.json
current: 0.03747,
mean pooling: 0.00119,

Though this model isn't really designed for this, so I'm very unsure whether we would expect it to be any better.

Samoed · 2026-03-06T13:26:01Z

I don't think the results will be better.

Samoed · 2026-03-06T14:43:54Z

You can also try to reproduce the scores from MIEB.

isaac-chung · 2026-03-07T09:03:13Z

I can't seem to find what the error is, and how to reproduce it?

Samoed · 2026-03-07T09:44:11Z

I tried on main:

mteb run -m google/siglip-base-patch16-384 -t VidoreArxivQARetrieval

  File "/mteb/mteb/models/model_implementations/siglip_models.py", line 71, in get_text_embeddings
    all_text_embeddings.append(text_outputs.cpu())
                               ^^^^^^^^^^^^^^^^
AttributeError: 'BaseModelOutputWithPooling' object has no attribute 'cpu'
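
The traceback suggests that the text encoder returned a wrapper object rather than a raw tensor, so `.cpu()` was called on the wrapper. A minimal sketch of one way to guard against this, assuming the wrapper exposes a `pooler_output` attribute as shown earlier in the thread, is below. `FakePooledOutput` and `to_embeddings` are hypothetical names for illustration, not the fix that was merged.

```python
from dataclasses import dataclass


@dataclass
class FakePooledOutput:
    """Hypothetical stand-in for a wrapper such as BaseModelOutputWithPooling."""

    last_hidden_state: list
    pooler_output: list


def to_embeddings(output):
    """Return pooled embeddings whether `output` is a wrapper or already raw."""
    pooled = getattr(output, "pooler_output", None)
    return output if pooled is None else pooled


wrapped = FakePooledOutput(
    last_hidden_state=[[[0.1, 0.2]]],
    pooler_output=[[0.5, 0.6]],
)
raw = [[0.5, 0.6]]

from_wrapped = to_embeddings(wrapped)  # unwraps to the pooled values
from_raw = to_embeddings(raw)          # passes raw values through unchanged
```

In the real model code the unwrapped value would be a tensor, so a call such as `.cpu()` would then be valid on the result.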

isaac-chung · 2026-03-07T09:46:06Z

When I ran that, I got a ModuleNotFoundError for sentencepiece, and then for protobuf.
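
Errors like these surface one at a time, so it can help to check for all optional modules up front. The following is a rough sketch using only the standard library. The import names `sentencepiece` and `google.protobuf` are assumptions about how the two packages are imported, and `missing_modules` is a hypothetical helper, not an mteb function.

```python
import importlib.util


def missing_modules(names):
    """Return the import names from `names` that cannot be found."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises if a parent package (e.g. "google") is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


# Import names assumed for the two packages named in the errors above.
required = ["sentencepiece", "google.protobuf"]
missing = missing_modules(required)
if missing:
    print("Missing optional dependencies:", ", ".join(missing))
```

Reporting every missing module in one pass avoids the install, rerun, fail again cycle described in the comment.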

isaac-chung · 2026-03-07T10:54:19Z

OK, I got the same AttributeError when I run

uv run --no-sync --extra siglip mteb run -t MSCOCOT2IRetrieval -m google/siglip-so400m-patch14-224

but when I run with MSCOCOI2TRetrieval, I get

ValueError: Found None in batch for key 'text'

Samoed · 2026-03-07T11:02:30Z

I think this error is not related to SigLIP. We just handle this task incorrectly.

KennethEnevoldsen · 2026-03-07T13:09:42Z

I ran it on CIFAR10

>>> res[0].get_score()
np.float64(0.9693000000000002)

This is notably better than what is reported in the paper (83.79). I then also checked STS12VisualSTS, for which the paper reports 61.90, and that matches my score:

>>> res[0].get_score()
np.float64(0.618997279216088)

So I think the implementation is correct, and that we might have had some issues in earlier versions of either the task or the model.
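
The scores returned by `get_score()` above are fractions, while the paper numbers quoted are percentages, so the comparison needs a scale conversion. A small sketch of that arithmetic, using only the values quoted in this comment (`to_percent` is a hypothetical helper):

```python
def to_percent(score):
    """Convert a fractional score in [0, 1] to a percentage rounded to 2 places."""
    return round(score * 100, 2)


# Values copied from the comment above; the paper numbers are percentages.
runs = {
    "CIFAR10": (0.9693000000000002, 83.79),
    "STS12VisualSTS": (0.618997279216088, 61.90),
}

for task, (measured, reported) in runs.items():
    diff = round(to_percent(measured) - reported, 2)
    print(f"{task}: measured {to_percent(measured)}, reported {reported}, diff {diff:+}")
```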

@isaac-chung also resolved the issue with missing dependencies.

isaac-chung · 2026-03-07T13:19:51Z

Looks good! Only thing left is likely the dependency conflicts then

This should happen here: https://github.com/embeddings-benchmark/mteb/blob/ce7590dcc9c620450ca192a3ec101a62631e6b55/mteb/_create_dataloaders.py#L291-L292 Not sure why it is needed

KennethEnevoldsen · 2026-03-07T13:46:04Z

> Looks good! Only thing left is likely the dependency conflicts then

Added a fix.

I also found another problem when running it on Caltech101ZeroShot and added a fix for that as well.

mteb/models/model_implementations/siglip_models.py

KennethEnevoldsen · 2026-03-11T09:39:37Z

I think this is good to merge with the fixes

fix: Error in siglib output conversion

911d2da

KennethEnevoldsen requested a review from isaac-chung March 6, 2026 09:45

KennethEnevoldsen enabled auto-merge (squash) March 6, 2026 09:45

Samoed disabled auto-merge March 6, 2026 09:55

Samoed reviewed Mar 6, 2026

Your Name added 2 commits March 6, 2026 11:21

add mean pool to siglip

d15a020

format

a3b51c6

Samoed reviewed Mar 6, 2026

Samoed reviewed Mar 6, 2026

KennethEnevoldsen commented Mar 6, 2026

Apply suggestions from code review

4672f39

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

isaac-chung changed the title ~~fix: Error in siglib output conversion~~ fix: Error in siglip output conversion Mar 7, 2026

KennethEnevoldsen added 2 commits March 7, 2026 13:57

add missing depencies

f8129b6

added fix for siglip dependencies

80af766

format

0eb0ea0

KennethEnevoldsen added 2 commits March 7, 2026 14:26

fix dependencies

ce7590d

added image normalization

d275608

KennethEnevoldsen commented Mar 7, 2026

KennethEnevoldsen added 3 commits March 7, 2026 15:15

relax protobuf dependency

798bd4c

lint

6383940

update pyproject.toml dependencies

3e6e902

KennethEnevoldsen mentioned this pull request Mar 9, 2026

Add vidore results embeddings-benchmark/results#436

Open

Merge branch 'main' of https://github.com/embeddings-benchmark/mteb i…

69822ed

…nto fix-siglip

KennethEnevoldsen enabled auto-merge (squash) March 11, 2026 09:39

KennethEnevoldsen merged commit ec20d1e into main Mar 11, 2026
13 checks passed

KennethEnevoldsen deleted the fix-siglip branch March 11, 2026 09:50

3 participants
