How to fine-tune a quantized model and convert it to GGUF while keeping it quantized? #1382

@kudla

Hi guys,
I've managed to:

  • fine-tune (train) a model from Hugging Face
  • fuse the model with the adapters
  • convert the result to GGUF with llama.cpp for use with Ollama

This works fine.
The only issue is that the source model is a quantized one, and in the example I followed the fuse step runs mlx_lm.fuse with the --de-quantize option. So the final model is pretty huge.
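
For reference, the working pipeline described above can be sketched roughly as follows. This is a minimal dry-run sketch, not the exact commands I ran: the model name, directories and output file are placeholders, the --de-quantize spelling follows the example mentioned above (it may differ between mlx-lm versions), and the script only prints the commands rather than executing them.

```python
import shlex


def build_pipeline(base_model, adapter_path, fused_dir, gguf_path,
                   convert_script="llama.cpp/convert_hf_to_gguf.py"):
    """Return the fuse and convert commands as argv lists (nothing is executed)."""
    fuse_cmd = [
        "mlx_lm.fuse",
        "--model", base_model,          # quantized source model (placeholder)
        "--adapter-path", adapter_path,  # adapters produced by fine-tuning
        "--save-path", fused_dir,        # where the fused HF-style model is written
        "--de-quantize",                 # flag as spelled in the example I followed
    ]
    convert_cmd = [
        "python", convert_script, fused_dir,
        "--outfile", gguf_path,
        "--outtype", "f16",              # assumption: keep 16-bit floats in the GGUF
    ]
    return [fuse_cmd, convert_cmd]


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    for cmd in build_pipeline("some-org/some-4bit-model", "adapters",
                              "fused_model", "model-f16.gguf"):
        print(shlex.join(cmd))
```

Because the fused model is de-quantized, the resulting GGUF holds full 16-bit weights, which is where the size comes from.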

I tried omitting the dequantization, but in that case the llama.cpp convert_hf_to_gguf.py conversion step fails with:

INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: loading model part 'model-00001-of-00003.safetensors'
Traceback (most recent call last):
  File "/workspace/llama.cpp/convert_hf_to_gguf.py", line 8595, in <module>
    main()
  File "/workspace/llama.cpp/convert_hf_to_gguf.py", line 8589, in main
    model_instance.write()
  File "/workspace/llama.cpp/convert_hf_to_gguf.py", line 410, in write
    self.prepare_tensors()
  File "/workspace/llama.cpp/convert_hf_to_gguf.py", line 277, in prepare_tensors
    for new_name, data_torch in (self.modify_tensors(data_torch, name, bid)):
  File "/workspace/llama.cpp/convert_hf_to_gguf.py", line 4969, in modify_tensors
    return [(self.map_tensor_name(name), data_torch)]
  File "/workspace/llama.cpp/convert_hf_to_gguf.py", line 236, in map_tensor_name
    raise ValueError(f"Can not map tensor {name!r}")
ValueError: Can not map tensor 'model.embed_tokens.biases'
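
My guess at the cause, stated as an assumption: MLX-quantized layers store extra per-group `scales` and `biases` tensors next to the packed `weight`, and the converter has no GGUF name to map them to. A small sketch to check which tensors in the fused model look like that, reading only the `weight_map` from `model.safetensors.index.json` (the helper names are made up for illustration):

```python
import json

# Suffixes assumed to mark MLX quantization side tensors.
# Note that an ordinary layer bias ends in ".bias", not ".biases".
MLX_QUANT_SUFFIXES = (".scales", ".biases")


def tensor_names_from_index(index_path):
    """Read tensor names from a sharded safetensors index file."""
    with open(index_path, encoding="utf-8") as f:
        return list(json.load(f)["weight_map"])


def find_mlx_quant_tensors(tensor_names):
    """Return the names that look like quantization scales/biases."""
    return sorted(n for n in tensor_names if n.endswith(MLX_QUANT_SUFFIXES))


if __name__ == "__main__":
    # Example names; the first three mirror the tensor from the traceback.
    names = [
        "model.embed_tokens.weight",
        "model.embed_tokens.scales",
        "model.embed_tokens.biases",
        "model.layers.0.self_attn.q_proj.bias",
    ]
    print(find_mlx_quant_tensors(names))
```

If this lists tensors for most layers, the fused model is still in MLX's quantized layout, which would explain why the converter stops at the first one it meets.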

So how can I do the same but keep the model quantized?

Or, instead of keeping the source model quantized, should I just quantize the resulting huge GGUF model afterwards?
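
That second option might look like the sketch below. It assumes llama.cpp's `llama-quantize` tool is built and on the PATH; the file names and the `Q4_K_M` type are placeholders, and the script only prints the command instead of running it.

```python
import shlex


def build_quantize_cmd(src_gguf, dst_gguf, quant_type="Q4_K_M",
                       tool="llama-quantize"):
    """Return the argv list for re-quantizing an existing GGUF file."""
    # llama-quantize takes: <input.gguf> <output.gguf> <type>
    return [tool, src_gguf, dst_gguf, quant_type]


if __name__ == "__main__":
    # Hypothetical file names: the large f16 GGUF in, a smaller 4-bit GGUF out.
    print(shlex.join(build_quantize_cmd("model-f16.gguf", "model-q4_k_m.gguf")))
```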

Thank you.
