Skip to content

Conversation

danbev
Copy link
Member

@danbev danbev commented Sep 8, 2025

This commit modifies the set_gguf_parameters method for EmbeddingGemma so that it reads the sliding_window parameter from the original model config.json and uses that value.

The motivation for this change is that the Gemma3TextConfig constructor adjusts the sliding_window value, which can lead to inconsistencies when converting models as we expects this value to match the original model's configuration.

Refs: https://github.com/huggingface/transformers/blob/bb45d3631ec7026db04a77d33a52b31766372160/src/transformers/models/gemma3/configuration_gemma3.py#L230

This commit modifies the set_gguf_parameters method for EmbeddingGemma
so that it reads the sliding_window parameter from the original model
config.json and uses that value.

The motivation for this change is that the Gemma3TextConfig
constructor adjusts the sliding_window value, which can lead to
inconsistencies when converting models as we expects this value to
match the original model's configuration.

Refs: https://github.com/huggingface/transformers/blob/bb45d3631ec7026db04a77d33a52b31766372160/src/transformers/models/gemma3/configuration_gemma3.py#L230
@github-actions github-actions bot added the python python script changes label Sep 8, 2025
@danbev danbev merged commit 233d773 into ggml-org:master Sep 8, 2025
7 checks passed
@danbev danbev deleted the embeddinggemma-sliding-window-conversion branch September 9, 2025 04:12
njsyw1997 pushed a commit to aizip/llama.cpp that referenced this pull request Sep 10, 2025
…#15867)

* convert : force setting sliding_window from original config

This commit modifies the set_gguf_parameters method for EmbeddingGemma
so that it reads the sliding_window parameter from the original model
config.json and uses that value.

The motivation for this change is that the Gemma3TextConfig
constructor adjusts the sliding_window value, which can lead to
inconsistencies when converting models as we expects this value to
match the original model's configuration.

Refs: https://github.com/huggingface/transformers/blob/bb45d3631ec7026db04a77d33a52b31766372160/src/transformers/models/gemma3/configuration_gemma3.py#L230

* fix flake8 error

* add link to huggingface PR
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

python python script changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants