feat: handle safetensors conversion for unshared incomplete tensors (#49)
#### Motivation
The automatic safetensors conversion for Mistral should work, but it
currently fails. Running
```
text-generation-server download-weights mistralai/Mistral-7B-v0.1 --revision=clarify-transformers-requirement
```
fails with
```
Traceback (most recent call last):
File "/opt/tgis/bin/text-generation-server", line 8, in <module>
sys.exit(app())
^^^^^
File "/opt/tgis/lib/python3.11/site-packages/text_generation_server/cli.py", line 104, in download_weights
convert_to_safetensors(model_name, revision)
File "/opt/tgis/lib/python3.11/site-packages/text_generation_server/cli.py", line 192, in convert_to_safetensors
utils.convert_files(local_pt_files, local_st_files, discard_names)
File "/opt/tgis/lib/python3.11/site-packages/text_generation_server/utils/convert.py", line 123, in convert_files
convert_file(pt_file, sf_file, discard_names)
File "/opt/tgis/lib/python3.11/site-packages/text_generation_server/utils/convert.py", line 69, in convert_file
to_removes = _remove_duplicate_names(loaded, discard_names=discard_names)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/tgis/lib/python3.11/site-packages/text_generation_server/utils/convert.py", line 33, in _remove_duplicate_names
raise RuntimeError(
RuntimeError: Error while trying to find names to remove to save state dict, but found no suitable name to keep for saving amongst: {'model.norm.weight'}. None is covering the entire storage.Refusing to save/load the model since you could be storing much more memory than needed. Please refer to https://huggingface.co/docs/safetensors/torch_shared_tensors for more information. Or open an issue.
```
`torch.save` serializes each tensor's full underlying storage, but the safetensors
format doesn't support shared tensors. However, the problematic
`model.norm.weight` tensor is not actually shared (it has no overlap with any other
tensor in the state_dict); it is merely incomplete (it does not fully cover
the span of its underlying storage).
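For illustration, here is a minimal sketch with a made-up toy tensor (not the actual Mistral checkpoint) of how a state_dict entry can be unshared yet incomplete: it is the only entry referencing its storage, but it covers fewer bytes than that storage holds.
```python
import torch

# Toy example: a slice is the only state_dict entry referencing its storage
# (so it is "unshared" from the converter's point of view), yet it spans only
# part of that storage (so it is "incomplete").
full = torch.zeros(10)
state_dict = {"model.norm.weight": full[:5]}

t = state_dict["model.norm.weight"]
covered_bytes = t.numel() * t.element_size()   # 5 * 4 = 20 bytes actually used
storage_bytes = t.untyped_storage().nbytes()   # 10 * 4 = 40 bytes in storage
print(covered_bytes, storage_bytes)            # 20 40 -> tensor does not cover its storage
```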
#### Modifications
`_find_shared_tensors` groups tensor names whose data overlaps and
returns them as a list of sets. A set has a single element when the
tensor's data is not shared, so this PR simply skips sets of size 1 when
removing duplicates (see the sketch below).
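A minimal sketch of the adjusted removal logic, assuming the private helpers `_find_shared_tensors` and `_is_complete` from `safetensors.torch` behave as described above; the structure and names here are illustrative, not the exact diff:
```python
from collections import defaultdict
from typing import Dict, List, Sequence

import torch
from safetensors.torch import _find_shared_tensors, _is_complete


def _remove_duplicate_names(
    state_dict: Dict[str, torch.Tensor],
    discard_names: Sequence[str] = (),
) -> Dict[str, List[str]]:
    """Map each kept tensor name to the overlapping names that should be dropped."""
    to_remove: Dict[str, List[str]] = defaultdict(list)
    for shared in _find_shared_tensors(state_dict):
        if len(shared) == 1:
            # Unshared tensor: it may not cover its whole storage, but there is
            # no duplicate to drop, so skip it instead of raising.
            continue
        complete = {name for name in shared if _is_complete(state_dict[name])}
        if not complete:
            raise RuntimeError(f"No complete tensor to keep amongst {shared}.")
        # Prefer keeping a name the caller has not asked to discard.
        keep = sorted(complete - set(discard_names) or complete)[0]
        for name in sorted(shared):
            if name != keep:
                to_remove[keep].append(name)
    return to_remove
```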
#### Result
We now support converting models that have unshared but incomplete
tensors.
#### Related Issues
---------
Signed-off-by: Travis Johnson <[email protected]>