Conversation

@danbev (Member) commented Aug 6, 2025

This commit addresses an issue with the convert_hf_to_gguf script, which is currently failing with:

AttributeError: module 'torch' has no attribute 'uint64'

This occurred because safetensors expects torch.uint64 to be available in the public API, but PyTorch 2.2.x provides only limited support for unsigned types beyond uint8: the torch.uint64 dtype exists internally but is not exposed in the standard torch namespace (see pytorch/pytorch#58734).
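
As a quick illustration (a hypothetical check, not part of the convert script), the attribute lookup that fails can be reproduced directly:

```python
import torch

# On PyTorch 2.2.x this raises AttributeError because torch.uint64 is not
# exposed in the public namespace; on 2.4.0+ it prints torch.uint64.
try:
    print(torch.uint64)
except AttributeError:
    print(f"torch {torch.__version__} does not expose torch.uint64")
```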

PyTorch 2.4.0 properly exposes torch.uint64 in the public API, resolving the compatibility issue with safetensors. This also required torchvision to be updated to 0.19.0 for compatibility.

Refs: https://huggingface.co/spaces/ggml-org/gguf-my-repo/discussions/186#68938de803e47d990aa087fb
Refs: pytorch/pytorch#58734
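
For reference, the fix amounts to bumping the version pins in the conversion requirements files. A minimal sketch of the resulting pins, assuming the compatible-release (`~=`) style used in those files (exact file names and surrounding pins may differ):

```
torch~=2.4.0
torchvision~=0.19.0
```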


This can be reproduced on master by trying to convert gpt-oss-mxfp4:

Convert tool output:

```console
$ python3.11 -m venv new-venv
$ source new-venv/bin/activate
(new-venv) $ pip install -r requirements/requirements-all.txt
(new-venv) $ ./convert_hf_to_gguf.py ~/work/ai/models/gpt-oss-mxfp4/ --outfile models/gpt-oss-mxfp4.gguf --outtype f16
INFO:hf-to-gguf:Loading model: gpt-oss-mxfp4
WARNING:hf-to-gguf:Failed to load model config from /home/danbev/work/ai/models/gpt-oss-mxfp4: The checkpoint you are trying to load has model type `gpt_oss` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:hf-to-gguf:Model architecture: GptOssForCausalLM
WARNING:hf-to-gguf:Failed to load model config from /home/danbev/work/ai/models/gpt-oss-mxfp4: The checkpoint you are trying to load has model type `gpt_oss` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model part 'model.safetensors'
INFO:hf-to-gguf:Repacked blk.0.ffn_down_exps.weight with shape [32, 32, 64] and quantization MXFP4
INFO:hf-to-gguf:Repacked blk.0.ffn_gate_exps.weight with shape [32, 64, 32] and quantization MXFP4
INFO:hf-to-gguf:Repacked blk.0.ffn_up_exps.weight with shape [32, 64, 32] and quantization MXFP4
INFO:hf-to-gguf:Repacked blk.1.ffn_down_exps.weight with shape [32, 32, 64] and quantization MXFP4
INFO:hf-to-gguf:Repacked blk.1.ffn_gate_exps.weight with shape [32, 64, 32] and quantization MXFP4
INFO:hf-to-gguf:Repacked blk.1.ffn_up_exps.weight with shape [32, 64, 32] and quantization MXFP4
INFO:hf-to-gguf:gguf: loading model part 'model.safetensors'
INFO:hf-to-gguf:token_embd.weight,                torch.bfloat16 --> F16, shape = {32, 201088}
INFO:hf-to-gguf:blk.0.attn_norm.weight,           torch.bfloat16 --> F32, shape = {32}
INFO:hf-to-gguf:blk.0.ffn_down_exps.bias,         torch.bfloat16 --> F32, shape = {32, 32}
INFO:hf-to-gguf:blk.0.ffn_gate_exps.bias,         torch.bfloat16 --> F32, shape = {64, 32}
INFO:hf-to-gguf:blk.0.ffn_up_exps.bias,           torch.bfloat16 --> F32, shape = {64, 32}
INFO:hf-to-gguf:blk.0.ffn_gate_inp.bias,          torch.bfloat16 --> F32, shape = {32}
INFO:hf-to-gguf:blk.0.ffn_gate_inp.weight,        torch.bfloat16 --> F32, shape = {32, 32}
INFO:hf-to-gguf:blk.0.post_attention_norm.weight, torch.bfloat16 --> F32, shape = {32}
INFO:hf-to-gguf:blk.0.attn_k.bias,                torch.bfloat16 --> F32, shape = {32}
INFO:hf-to-gguf:blk.0.attn_k.weight,              torch.bfloat16 --> F16, shape = {32, 32}
INFO:hf-to-gguf:blk.0.attn_output.bias,           torch.bfloat16 --> F32, shape = {32}
INFO:hf-to-gguf:blk.0.attn_output.weight,         torch.bfloat16 --> F16, shape = {64, 32}
INFO:hf-to-gguf:blk.0.attn_q.bias,                torch.bfloat16 --> F32, shape = {64}
INFO:hf-to-gguf:blk.0.attn_q.weight,              torch.bfloat16 --> F16, shape = {32, 64}
INFO:hf-to-gguf:blk.0.attn_sinks.weight,          torch.bfloat16 --> F32, shape = {2}
INFO:hf-to-gguf:blk.0.attn_v.bias,                torch.bfloat16 --> F32, shape = {32}
INFO:hf-to-gguf:blk.0.attn_v.weight,              torch.bfloat16 --> F16, shape = {32, 32}
INFO:hf-to-gguf:blk.1.attn_norm.weight,           torch.bfloat16 --> F32, shape = {32}
INFO:hf-to-gguf:blk.1.ffn_down_exps.bias,         torch.bfloat16 --> F32, shape = {32, 32}
INFO:hf-to-gguf:blk.1.ffn_gate_exps.bias,         torch.bfloat16 --> F32, shape = {64, 32}
INFO:hf-to-gguf:blk.1.ffn_up_exps.bias,           torch.bfloat16 --> F32, shape = {64, 32}
INFO:hf-to-gguf:blk.1.ffn_gate_inp.bias,          torch.bfloat16 --> F32, shape = {32}
INFO:hf-to-gguf:blk.1.ffn_gate_inp.weight,        torch.bfloat16 --> F32, shape = {32, 32}
INFO:hf-to-gguf:blk.1.post_attention_norm.weight, torch.bfloat16 --> F32, shape = {32}
INFO:hf-to-gguf:blk.1.attn_k.bias,                torch.bfloat16 --> F32, shape = {32}
INFO:hf-to-gguf:blk.1.attn_k.weight,              torch.bfloat16 --> F16, shape = {32, 32}
INFO:hf-to-gguf:blk.1.attn_output.bias,           torch.bfloat16 --> F32, shape = {32}
INFO:hf-to-gguf:blk.1.attn_output.weight,         torch.bfloat16 --> F16, shape = {64, 32}
INFO:hf-to-gguf:blk.1.attn_q.bias,                torch.bfloat16 --> F32, shape = {64}
INFO:hf-to-gguf:blk.1.attn_q.weight,              torch.bfloat16 --> F16, shape = {32, 64}
INFO:hf-to-gguf:blk.1.attn_sinks.weight,          torch.bfloat16 --> F32, shape = {2}
INFO:hf-to-gguf:blk.1.attn_v.bias,                torch.bfloat16 --> F32, shape = {32}
INFO:hf-to-gguf:blk.1.attn_v.weight,              torch.bfloat16 --> F16, shape = {32, 32}
INFO:hf-to-gguf:output_norm.weight,               torch.bfloat16 --> F32, shape = {32}
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 131072
INFO:hf-to-gguf:gguf: embedding length = 32
INFO:hf-to-gguf:gguf: feed forward length = 64
INFO:hf-to-gguf:gguf: head count = 2
INFO:hf-to-gguf:gguf: key-value head count = 1
INFO:hf-to-gguf:gguf: rope theta = 150000
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: expert count = 32
INFO:hf-to-gguf:gguf: experts used count = 4
INFO:hf-to-gguf:gguf: file type = 1
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
Traceback (most recent call last):
  File "/home/danbev/work/ai/llama.cpp-debug/new-venv/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1778, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/danbev/work/ai/llama.cpp-debug/new-venv/lib/python3.11/site-packages/transformers/generation/utils.py", line 40, in <module>
    from ..pytorch_utils import isin_mps_friendly
  File "/home/danbev/work/ai/llama.cpp-debug/new-venv/lib/python3.11/site-packages/transformers/pytorch_utils.py", line 21, in <module>
    from safetensors.torch import storage_ptr, storage_size
  File "/home/danbev/work/ai/llama.cpp-debug/new-venv/lib/python3.11/site-packages/safetensors/torch.py", line 439, in <module>
    torch.uint64: 8,
    ^^^^^^^^^^^^
  File "/home/danbev/work/ai/llama.cpp-debug/new-venv/lib/python3.11/site-packages/torch/__init__.py", line 1938, in __getattr__
    raise AttributeError(f"module '{__name__}' has no attribute '{name}'")
AttributeError: module 'torch' has no attribute 'uint64'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/danbev/work/ai/llama.cpp-debug/new-venv/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1778, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/danbev/work/ai/llama.cpp-debug/new-venv/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 38, in <module>
    from .auto_factory import _LazyAutoMapping
  File "/home/danbev/work/ai/llama.cpp-debug/new-venv/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 40, in <module>
    from ...generation import GenerationMixin
  File "<frozen importlib._bootstrap>", line 1229, in _handle_fromlist
  File "/home/danbev/work/ai/llama.cpp-debug/new-venv/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1766, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/danbev/work/ai/llama.cpp-debug/new-venv/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1780, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.generation.utils because of the following error (look up to see its traceback):
module 'torch' has no attribute 'uint64'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/danbev/work/ai/llama.cpp-debug/./convert_hf_to_gguf.py", line 8479, in <module>
    main()
  File "/home/danbev/work/ai/llama.cpp-debug/./convert_hf_to_gguf.py", line 8473, in main
    model_instance.write()
  File "/home/danbev/work/ai/llama.cpp-debug/./convert_hf_to_gguf.py", line 411, in write
    self.prepare_metadata(vocab_only=False)
  File "/home/danbev/work/ai/llama.cpp-debug/./convert_hf_to_gguf.py", line 524, in prepare_metadata
    self.set_vocab()
  File "/home/danbev/work/ai/llama.cpp-debug/./convert_hf_to_gguf.py", line 8051, in set_vocab
    self._set_vocab_gpt2()
  File "/home/danbev/work/ai/llama.cpp-debug/./convert_hf_to_gguf.py", line 887, in _set_vocab_gpt2
    tokens, toktypes, tokpre = self.get_vocab_base()
                               ^^^^^^^^^^^^^^^^^^^^^
  File "/home/danbev/work/ai/llama.cpp-debug/./convert_hf_to_gguf.py", line 609, in get_vocab_base
    from transformers import AutoTokenizer
  File "<frozen importlib._bootstrap>", line 1229, in _handle_fromlist
  File "/home/danbev/work/ai/llama.cpp-debug/new-venv/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1767, in __getattr__
    value = getattr(module, name)
            ^^^^^^^^^^^^^^^^^^^^^
  File "/home/danbev/work/ai/llama.cpp-debug/new-venv/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1766, in __getattr__
    module = self._get_module(self._class_to_module[name])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/danbev/work/ai/llama.cpp-debug/new-venv/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1780, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.auto.tokenization_auto because of the following error (look up to see its traceback):
Failed to import transformers.generation.utils because of the following error (look up to see its traceback):
module 'torch' has no attribute 'uint64'
```
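
The traceback makes the failure mode clear: safetensors builds a module-level dict mapping torch dtypes to byte sizes, so merely importing safetensors.torch touches torch.uint64. A minimal sketch of that pattern (not the actual safetensors source):

```python
import torch

# Evaluated at import time: if the installed torch does not expose a dtype
# referenced here, importing the module fails with AttributeError.
_DTYPE_SIZE = {
    torch.int64: 8,
    torch.float32: 4,
    torch.uint64: 8,  # hidden from the public API on torch 2.2.x
}
```

Because the dict is evaluated at module scope, bumping the torch pin is a simpler fix than guarding every import in the conversion script.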

@github-actions bot added the examples and python labels on Aug 6, 2025