
Conversation

@danielhanchen
Contributor

Llama 4 Scout's config.json changed its RoPE scaling, so we need to remove the assert since it breaks on Llama 4.
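For context, a minimal sketch (not the exact code in convert_hf_to_gguf.py) of how a config change like this can trip such a guard: if the updated rope_scaling block makes low_freq_factor equal to high_freq_factor, the two derived wavelength bounds coincide, and an assert that they differ will fire. The values and the assert below are illustrative assumptions, not the real Llama 4 Scout numbers or the actual removed line.

```python
# Illustrative values only, not the exact Llama 4 Scout config.json contents.
rope_scaling = {
    "factor": 16.0,
    "low_freq_factor": 1.0,
    "high_freq_factor": 1.0,
    "original_max_position_embeddings": 8192,
}

low_freq_wavelen = rope_scaling["original_max_position_embeddings"] / rope_scaling["low_freq_factor"]
high_freq_wavelen = rope_scaling["original_max_position_embeddings"] / rope_scaling["high_freq_factor"]

# Hypothetical guard of the kind this PR removes: with the values above the two
# wavelength bounds coincide, so this check raises an AssertionError.
assert low_freq_wavelen != high_freq_wavelen
```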

@danielhanchen
Contributor Author

@ngxson :)

Collaborator

@ngxson ngxson left a comment


Nice, thanks!

@ngxson ngxson merged commit ec6c09d into ggml-org:master Apr 11, 2025
5 checks passed
@danielhanchen danielhanchen deleted the patch-2 branch April 11, 2025 07:57
@ngxson
Collaborator

ngxson commented Apr 11, 2025

@ggerganov FYI, I wrote a small script here: https://gist.github.com/ngxson/6dec015080121d239caa668332fba3f8

It calculates the rope_factors using two methods: one is the current code in convert_hf_to_gguf, and the other is code copied from transformers. We verified that the outputs from the two methods match.

Below are the values: on the left, the old rope_factors (before this commit); on the right, the values after this commit:

idx   before   after
0        1.0     1.0
1        1.0     1.0
2        1.0     1.0
3        1.0     1.0
4        1.0     1.0
5        1.0     1.0
6        1.0     1.0
7        1.0     1.0
8        1.0     1.0
9        1.0     1.0
10       1.0     1.0
11       1.0     1.0
12       1.0     1.0
13       1.0     1.0
14       1.0     1.0
15       1.0     1.0
16       1.0     1.0
17       1.0     1.0
18       1.0     1.0
19       1.0     1.0
20       1.0     1.0
21       1.0     1.0
22       1.0     1.0
23       1.0     1.0
24       1.0     1.0
25       1.0     1.0
26       1.0     1.0
27       1.0     1.0
28       1.0     1.0
29       1.0     1.0
30       1.0     1.0
31       1.0     1.0
32       1.0     1.0
33       1.0     1.0
34       1.0     1.0
35       1.0     16.0
36       1.0     16.0
37       1.0     16.0
38       1.0     16.0
39       1.0     16.0
40       1.0     16.0
41       1.143270492553711       16.0
42       1.3603488206863403      16.0
43       1.6280388832092285      16.0
44       1.9624499082565308      16.0
45       2.3870468139648438      16.0
46       2.9374001026153564      16.0
47       3.6701674461364746      16.0
48       4.681482315063477       16.0
49       6.148653984069824       16.0
50       8.0     16.0
51       8.0     16.0
52       8.0     16.0
53       8.0     16.0
54       8.0     16.0
55       8.0     16.0
56       8.0     16.0
57       8.0     16.0
58       8.0     16.0
59       8.0     16.0
60       8.0     16.0
61       8.0     16.0
62       8.0     16.0
63       8.0     16.0

For now, I have no idea how to set this after loading the GGUF (as discussed via DM), but feel free to make suggestions!
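For readers following along, here is a minimal sketch of the Llama-3-style smooth-interpolation formula that both methods implement. The function signature and the hyperparameters in the example call are illustrative assumptions, not the exact code in convert_hf_to_gguf or transformers, and not the actual Llama 4 Scout values.

```python
import math

def compute_rope_factors(head_dim: int, rope_theta: float, factor: float,
                         low_freq_factor: float, high_freq_factor: float,
                         original_max_position_embeddings: int) -> list[float]:
    """One scaling factor per rotary frequency (head_dim // 2 entries)."""
    low_freq_wavelen = original_max_position_embeddings / low_freq_factor
    high_freq_wavelen = original_max_position_embeddings / high_freq_factor
    factors = []
    for i in range(0, head_dim, 2):
        freq = 1.0 / (rope_theta ** (i / head_dim))
        wavelen = 2 * math.pi / freq
        if wavelen < high_freq_wavelen:      # high-frequency dims stay unscaled
            factors.append(1.0)
        elif wavelen > low_freq_wavelen:     # low-frequency dims get the full factor
            factors.append(factor)
        else:                                # smooth interpolation in between
            smooth = (original_max_position_embeddings / wavelen - low_freq_factor) \
                     / (high_freq_factor - low_freq_factor)
            factors.append(1.0 / ((1.0 - smooth) / factor + smooth))
    return factors

# Illustrative hyperparameters only, not the actual Llama 4 Scout values.
print(compute_rope_factors(head_dim=128, rope_theta=500000.0, factor=8.0,
                           low_freq_factor=1.0, high_freq_factor=4.0,
                           original_max_position_embeddings=8192))
```

The "after" column jumping straight from 1.0 to 16.0 with no interpolated entries is consistent with the updated config leaving no frequencies inside the interpolation band.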

@github-actions github-actions bot added the python (python script changes) label Apr 11, 2025
@ggerganov
Member

ggerganov commented Apr 11, 2025

Thanks. AFAIU the upstream models have been updated with a new RoPE config, which technically would require re-converting existing GGUF models. I don't think there is an elegant way to avoid this conversion and do it seamlessly so that old GGUFs work with the new rope factors. It seems it would always be some non-trivial hack that would remain in the codebase forever. So I think it is better to just recommend conversion/re-download of the models. We could put a notice in the README under hot topics?

@ngxson
Collaborator

ngxson commented Apr 11, 2025

Yes, sounds OK to me; if it's too hacky then let's not do it. I think most people will use quantizations from @unslothai or @bartowski1182 anyway (which were and will be updated very quickly), so we probably don't need to add a notice.

@bartowski1182
Contributor

Yeah, I'll let it sit another week to make sure there's nothing else breaking, then throw the reconverted model up.

Nexesenex pushed a commit to Nexesenex/croco.cpp that referenced this pull request Apr 11, 2025
Nexesenex pushed a commit to Nexesenex/croco.cpp that referenced this pull request Apr 12, 2025
colout pushed a commit to colout/llama.cpp that referenced this pull request Apr 21, 2025