Skip to content

Conversation

i3hz
Copy link
Contributor

@i3hz i3hz commented Oct 6, 2025

What does this PR do?

This PR fixes a numerical regression bug in the vision positional embedding calculation that was introduced between transformers versions v4.54.1 and v4.55.1. The original change was made to improve exportability but resulted in a slight floating point difference.
The fix was applied to the base modular files (modular_idefics2.py and modular_idefics3.py) and then propagated to all dependent models via make fix-copies.

Fixes #41190

Before submitting

Who can review

@zucchini-nlp

Copy link
Contributor

github-actions bot commented Oct 6, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: idefics2, idefics3, smolvlm

@zucchini-nlp
Copy link
Member

run-slow: idefics2, idefics3, smolvlm

Copy link
Contributor

github-actions bot commented Oct 7, 2025

This comment contains run-slow, running the specified jobs:

models: ['models/idefics2', 'models/idefics3', 'models/smolvlm']
quantizations: [] ...

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@i3hz
Copy link
Contributor Author

i3hz commented Oct 7, 2025

-slow: idefics2, idefics3

I'm struggling a bit to understand where exactly the tests have failed . I can see a test_model_parallel_beam_search test has failed .
Is there anything else that has failed ? And any pointers on how I can work on fix that would be really appreciated.
Thank you @zucchini-nlp for looking into the PR

@i3hz
Copy link
Contributor Author

i3hz commented Oct 7, 2025

On my local machine after running python3 -m pytest -v -rsfE --make-reports=multi-gpu_run_models_gpu_models/smolvlm_test_reports tests/models/smolvlm

I got the result :
244 passed, 126 skipped, 18 warnings in 111.25s (0:01:51)

I don't have a multi gpu system so I'm not sure if that's why I have more skips

@zucchini-nlp
Copy link
Member

@i3hz the only failing test is tests/models/smolvlm/test_modeling_smolvlm.py::SmolVLMForConditionalGenerationIntegrationTest::test_integration_test_video which seems to be unrelated

Btw for running locally you also need to set RUN_SLOW=1 which runs slow tests as well. Can you check that the export test passes, then we can merge

tests/models/smolvlm/test_modeling_smolvlm.py::SmolVLMForConditionalGenerationIntegrationTest::test_export_smolvlm_connector
tests/models/smolvlm/test_modeling_smolvlm.py::SmolVLMForConditionalGenerationIntegrationTest::test_export_smolvlm_text_decoder
tests/models/smolvlm/test_modeling_smolvlm.py::SmolVLMForConditionalGenerationIntegrationTest::test_export_smolvlm_vision_encoder

@i3hz
Copy link
Contributor Author

i3hz commented Oct 7, 2025

@zucchini-nlp I ran the RUN_SLOW=1 pytest tests/models/smolvlm/test_modeling_smolvlm.py command .

With these results
8 failed, 176 passed, 93 skipped, 11 warnings in 182.77s (0:03:02)

The good news is that all three export tests passed successfully-

  • test_export_smolvlm_connector
  • test_export_smolvlm_text_decoder
  • test_export_smolvlm_vision_encoder

As for the failures -

  • test_integration_test_video
  • The other 7 are regarding FlashAttention . They all failed with the same error: RuntimeError: cu_seqlens_q must have shape (batch_size + 1). I could be wrong but I think these are unrelated .

Let me know what you think .
Thanks

Copy link
Member

@zucchini-nlp zucchini-nlp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for verifying! Looks good to me, let's merge

@zucchini-nlp zucchini-nlp merged commit 4763b8c into huggingface:main Oct 7, 2025
19 of 20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Regression in SmolVLM results in different vision embeddings
3 participants