[convert] Support for DeepSeek-V3.2 and Dequantizing #641

brian-dellabetta wants to merge 87 commits into main
Conversation
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Updated README to accurately document the convert_checkpoint entrypoint, including the Converter system, ModelOptNvfp4Converter, and usage examples. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
The quality checks have failed.
kylesayrs left a comment:

Outside of the error for single-shard models, looks good to me! Thanks for being open to suggestions.
```python
all_dependencies: set[str] = set()
for values in weight_deps_dict.values():
    for value in values:
        all_dependencies.add(value)
```

Suggested change:

```python
all_dependencies: set[str] = set().union(*weight_deps_dict.values())
```
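As a quick sanity check that the one-liner is equivalent to the loop, here is a self-contained sketch (the `weight_deps_dict` contents are invented for illustration):

```python
# Illustrative dependency map: weight name -> set of dependency weight names
weight_deps_dict = {
    "layers.0.weight": {"layers.0.weight_scale_inv"},
    "layers.1.weight": {"layers.1.weight_scale_inv", "shared.scale"},
}

# Loop form from the original diff
all_dependencies_loop: set[str] = set()
for values in weight_deps_dict.values():
    for value in values:
        all_dependencies_loop.add(value)

# Suggested one-liner: union of all value sets
all_dependencies: set[str] = set().union(*weight_deps_dict.values())

assert all_dependencies == all_dependencies_loop
```

Note that `set().union(*{}.values())` also handles the empty-dict case, returning an empty set.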
| """ | ||
| Given a weight name, return a dictionary of all dependency weight names, so that | ||
| weights can be processed correctly and in a parallelized fashion. | ||
| If a dependency is optional, the value associated with the key should be False. |
There was a problem hiding this comment.
Please give an example of an "optional dependency" to make the concept clear
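One possible illustration of an "optional dependency" under a `{dependency_name: is_required}` mapping. The helper and tensor names here are hypothetical, not the PR's actual code:

```python
# Hypothetical sketch: a weight's dependencies as {dependency_name: is_required}.
# An optional dependency maps to False, e.g. a secondary scale tensor that only
# some quantization schemes produce. All names here are illustrative.
def get_dependency_weights(weight_name: str) -> dict[str, bool]:
    if weight_name.endswith(".weight"):
        prefix = weight_name[: -len("weight")]
        return {
            prefix + "weight_scale_inv": True,  # required to dequantize
            prefix + "weight_scale_2": False,   # optional: may be absent
        }
    return {}

deps = get_dependency_weights("model.layers.0.gate_proj.weight")
required = {name for name, is_required in deps.items() if is_required}
```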
```python
return current_deps


# map of weight name -> ( map of dependency name -> is_required )
weight_deps_dict: dict[str, set[str]] = defaultdict(set)
```

Suggested change:

```python
weight_deps_dict: dict[str, dict[str, bool]] = dict()
```
```python
else:
    continue
weight_to_add_shard_name = weight_map[weight_to_add_name]
resolved_path = model_files.get(weight_to_add_shard_name)
```

Suggested change:

```python
resolved_path = model_files[weight_to_add_shard_name]
```
```python
resolved_path = model_files.get(weight_to_add_shard_name)
inverse_weight_map[resolved_path].append(weight_to_add_name)


# return dicts, not defaultdicts, to avoid silent errors
```
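A tiny illustration of the "silent errors" that comment refers to: a `defaultdict` happily returns an empty value for a mistyped key (and inserts it), while a plain dict raises immediately:

```python
from collections import defaultdict

shards = defaultdict(list)
shards["model-00001.safetensors"].append("layers.0.weight")

# Typo'd lookup: a defaultdict silently returns (and inserts!) an empty list
silent = shards["model-0001.safetensors"]

# A plain dict surfaces the same mistake right away
plain = {"model-00001.safetensors": ["layers.0.weight"]}
try:
    plain["model-0001.safetensors"]
    caught = False
except KeyError:
    caught = True
```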
```python
# NOTE: sometimes models split weights across different files
logger.warning(
```

Shouldn't this be an error?
```python
disallowed_names = ["weight_scale_inv"]
untargeted_names = [
    name for name in tensors.keys() if name not in targeted_names
]
for name in untargeted_names:
    param_name = name.rsplit(".", 1)[-1]

    if param_name in disallowed_names:
        raise ValueError(f"Found unexpected non-targeted tensor {name}")
```

This makes sense for now, but is inflexible if we want to support mixed recipes. For example, convert some weights to full precision, but convert other weights from fp8 to compressed-tensors.
```python
# Read weight map from safetensors.index.json
index_file = find_safetensors_index_file(model_files)
with open(index_file, "r") as f:
    weight_map: dict[str, str] = json.load(f)["weight_map"]
```

This will error if there is no index file. Use something like this instead:
```python
def get_weight_map(model_files):
    index_file = find_safetensors_index_file(model_files)
    if index_file is not None:
        with open(index_file, "r") as f:
            return json.load(f)["weight_map"]
    else:
        # single-shard model: every tensor lives in the lone weights file
        with safe_open(SAFE_WEIGHTS_NAME, framework="pt") as file:
            return {tensor: SAFE_WEIGHTS_NAME for tensor in file.keys()}
```

```python
config_data = json.load(file)
```
```python
config_data[QUANTIZATION_CONFIG_NAME] = quant_config_data
if quant_config_data is None:
```

The qconfig field is not guaranteed to exist.

Suggested change:

```python
if quant_config_data is None and QUANTIZATION_CONFIG_NAME in config_data:
```
```python
:return: absolute path to the safetensors index file, or None if not found
"""
for file_path, resolved_path in model_files.items():
    if file_path.endswith("safetensors.index.json"):
```

Suggested change:

```python
if file_path.endswith(SAFE_WEIGHTS_INDEX_NAME):
```
Corequisite: Merge in conjunction with vllm-project/llm-compressor#2491

This PR enhances the `convert_checkpoint` entrypoint to handle dequantization as well, and adds new functionality to be compatible with DeepSeek-V3.2:

- Adds a `get_dependency_weight` method to the Converter interface, so converters can define whether a weight has dependency weights that also need to be processed along with it.
- Adds `build_inverse_weight_maps` logic, following the pattern in llm-compressor, so that all weight dependencies can be loaded in with the given weight, even if they live in separate safetensors files. The DeepSeek model often splits a module's `weight` and `weight_scale_inv` tensors across different files, and they need to be processed together when dequantizing.
- Updates the `create_quant_config` signature to return an Optional field. This is needed when converting a checkpoint to bfloat16, when no accompanying `compressed-tensors` quantization config is needed.
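The inverse-weight-map idea described above can be sketched as follows (the map contents are invented for illustration, and the PR's actual implementation may differ):

```python
from collections import defaultdict

# weight name -> shard file, as read from model.safetensors.index.json
weight_map = {
    "layers.0.weight": "model-00001.safetensors",
    "layers.0.weight_scale_inv": "model-00002.safetensors",
    "layers.1.weight": "model-00002.safetensors",
}

# Invert to shard file -> weight names, so each shard can be opened once and
# every weight living in it (including dependencies like weight_scale_inv)
# loaded together, even when a module's tensors are split across files.
inverse_weight_map: dict[str, list[str]] = defaultdict(list)
for weight_name, shard_name in weight_map.items():
    inverse_weight_map[shard_name].append(weight_name)

# Return a plain dict, not a defaultdict, to avoid silent errors downstream
inverse_weight_map = dict(inverse_weight_map)
```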
TODOs:
Test Plan: