
Commit 1ca4932

feat: handle safetensors conversion for unshared incomplete tensors (#49)
#### Motivation

The automatic safetensors conversion for Mistral should work, but it currently fails. Running

```
text-generation-server download-weights mistralai/Mistral-7B-v0.1 --revision=clarify-transformers-requirement
```

fails with

```
Traceback (most recent call last):
  File "/opt/tgis/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
             ^^^^^
  File "/opt/tgis/lib/python3.11/site-packages/text_generation_server/cli.py", line 104, in download_weights
    convert_to_safetensors(model_name, revision)
  File "/opt/tgis/lib/python3.11/site-packages/text_generation_server/cli.py", line 192, in convert_to_safetensors
    utils.convert_files(local_pt_files, local_st_files, discard_names)
  File "/opt/tgis/lib/python3.11/site-packages/text_generation_server/utils/convert.py", line 123, in convert_files
    convert_file(pt_file, sf_file, discard_names)
  File "/opt/tgis/lib/python3.11/site-packages/text_generation_server/utils/convert.py", line 69, in convert_file
    to_removes = _remove_duplicate_names(loaded, discard_names=discard_names)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/tgis/lib/python3.11/site-packages/text_generation_server/utils/convert.py", line 33, in _remove_duplicate_names
    raise RuntimeError(
RuntimeError: Error while trying to find names to remove to save state dict, but found no suitable name to keep for saving amongst: {'model.norm.weight'}. None is covering the entire storage.Refusing to save/load the model since you could be storing much more memory than needed. Please refer to https://huggingface.co/docs/safetensors/torch_shared_tensors for more information. Or open an issue.
```

`torch.save` saves the underlying storage tensor but the safetensor format doesn't support shared tensors. However, the problematic `model.norm.weight` layer is not actually shared (no overlap with other tensors in the state_dict), but it is incomplete (does not fully cover the span of its underlying storage tensor).
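To make the distinction between "shared" and "incomplete" concrete, here is a minimal pure-Python sketch. It does not use torch; `TensorView`, `is_complete`, and `overlaps` are hypothetical stand-ins invented for illustration, and the byte sizes are made up rather than taken from the Mistral checkpoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TensorView:
    """Hypothetical model of a tensor as a byte range inside a storage buffer."""
    storage_id: int      # identifies the underlying storage
    storage_nbytes: int  # total size of that storage
    offset: int          # where this tensor's data starts in the storage
    nbytes: int          # how many bytes this tensor actually uses


def is_complete(t: TensorView) -> bool:
    # A tensor is "complete" when it covers its whole storage.
    return t.offset == 0 and t.nbytes == t.storage_nbytes


def overlaps(a: TensorView, b: TensorView) -> bool:
    # Two tensors are "shared" when their byte ranges intersect in the same storage.
    return (
        a.storage_id == b.storage_id
        and a.offset < b.offset + b.nbytes
        and b.offset < a.offset + a.nbytes
    )


# Illustrative values only: a norm weight that uses half of its own storage,
# and another tensor that lives in a different storage entirely.
norm = TensorView(storage_id=1, storage_nbytes=4096, offset=0, nbytes=2048)
embed = TensorView(storage_id=2, storage_nbytes=8192, offset=0, nbytes=8192)

print(is_complete(norm))     # False: incomplete
print(overlaps(norm, embed))  # False: not shared with anything
```

In this model `norm` is the case described above: nothing else touches its storage, yet it does not span all of it, so a check that only asks "is any name in this group complete?" finds no candidate.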
#### Modifications

`_find_shared_tensors` groups tensor names with overlapping data and returns them as a list of sets. These sets will have a single element when the tensor data is not shared, so this PR just skips sets of size 1 when removing duplicates.

#### Result

We now support converting models that have unshared but incomplete tensors.

#### Related Issues

---------

Signed-off-by: Travis Johnson <[email protected]>
1 parent 381af70 commit 1ca4932

1 file changed

+5
-0
lines changed

server/text_generation_server/utils/convert.py

Lines changed: 5 additions & 0 deletions
@@ -26,6 +26,11 @@ def _remove_duplicate_names(
     shareds = _find_shared_tensors(state_dict)
     to_remove = defaultdict(list)
     for shared in shareds:
+        # _find_shared_tensors returns a list of sets of names of tensors that
+        # have the same data, including sets with one element that aren't shared
+        if len(shared) == 1:
+            continue
+
         complete_names = set(
             [name for name in shared if _is_complete(state_dict[name])]
         )
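The effect of the added guard can be exercised in isolation with a simplified sketch. This is not the real `_remove_duplicate_names`: the function below, its `groups` and `complete` parameters, and the rule for choosing which name to keep (alphabetically first) are assumptions made for illustration. Only the size-1 skip mirrors the change in this commit.

```python
from collections import defaultdict


def remove_duplicate_names(groups, complete):
    """Simplified, hypothetical version of the duplicate-removal loop.

    groups:   list of sets of tensor names that share data
    complete: dict mapping each name to whether it covers its whole storage
    """
    to_remove = defaultdict(list)
    for shared in groups:
        # A set with one element is not shared with anything, so there is
        # nothing to deduplicate, even if that tensor is incomplete.
        if len(shared) == 1:
            continue

        complete_names = {name for name in shared if complete[name]}
        if not complete_names:
            raise RuntimeError(
                f"found no suitable name to keep for saving amongst: {shared}"
            )

        # Assumption for this sketch: keep the alphabetically first complete name.
        keep = sorted(complete_names)[0]
        for name in sorted(shared):
            if name != keep:
                to_remove[keep].append(name)
    return dict(to_remove)


groups = [
    {"model.norm.weight"},              # unshared but incomplete
    {"embed.weight", "lm_head.weight"},  # genuinely shared (illustrative names)
]
complete = {
    "model.norm.weight": False,
    "embed.weight": True,
    "lm_head.weight": True,
}

result = remove_duplicate_names(groups, complete)
print(result)  # {'embed.weight': ['lm_head.weight']}
```

Without the `len(shared) == 1` check, the first group would reach the completeness test with no complete name and raise, which corresponds to the failure reported in the commit message.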
