[convert] Support for DeepSeek-V3.2 and Dequantizing#641

Open
brian-dellabetta wants to merge 87 commits into main from bdellabe/ds32-to-bfloat16

Conversation

@brian-dellabetta
Collaborator

@brian-dellabetta brian-dellabetta commented Mar 19, 2026

Corequisite:

This PR enhances the convert_checkpoint entrypoint to handle dequantization as well, and adds new functionality to be compatible with DeepSeek-V3.2:

  • Add an FP8Dequantizer converter to upconvert checkpoints with quant method fp8 and quant scheme FP8_BLOCK back to bfloat16.
  • Add a get_dependency_weight method to the Converter interface, so converters can declare whether a weight has dependency weights that must be processed along with it.
  • Add build_inverse_weight_maps logic, following the pattern in llm-compressor, so that all of a weight's dependencies can be loaded together with it, even if they live in separate safetensors files. DeepSeek checkpoints often split a module's weight and weight_scale_inv tensors across different files, and these must be processed together when dequantizing.
  • Update the converter create_quant_config signature to return an Optional value. This is needed when converting a checkpoint to bfloat16, where no accompanying compressed-tensors quantization config is required.
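
The inverse-weight-map idea described in the bullets above can be sketched roughly as follows. This is a minimal illustration and not the PR's implementation: build_inverse_weight_map, fp8_dependencies, and the shard names are hypothetical, and the only assumption taken from the description is that a module's weight and weight_scale_inv may live in different safetensors files.

```python
from collections import defaultdict


def fp8_dependencies(name: str) -> set[str]:
    """Hypothetical rule: an FP8 weight needs its weight_scale_inv sibling."""
    if name.endswith(".weight"):
        return {name + "_scale_inv"}
    return set()


def build_inverse_weight_map(weight_map: dict[str, str]) -> dict[str, list[str]]:
    """Map each shard file to every tensor needed to process that shard.

    A dependency stored in another shard is also listed under the shard of
    the weight that needs it, so both can be loaded together.
    """
    inverse: dict[str, list[str]] = defaultdict(list)
    for name, shard in weight_map.items():
        if name.endswith("_scale_inv"):
            continue  # scales are handled alongside the weight they belong to
        inverse[shard].append(name)
        for dep in sorted(fp8_dependencies(name)):
            # Only add dependencies that actually exist in the checkpoint.
            if dep in weight_map and dep not in inverse[shard]:
                inverse[shard].append(dep)
    return dict(inverse)  # plain dict, so a missing shard raises KeyError


# Made-up example: the scale lives in a different shard than its weight.
example_weight_map = {
    "layers.0.q_proj.weight": "model-00001.safetensors",
    "layers.0.q_proj.weight_scale_inv": "model-00002.safetensors",
    "layers.0.norm.weight": "model-00001.safetensors",
}
example_inverse = build_inverse_weight_map(example_weight_map)
```

In this made-up example, the first shard's job lists the scale tensor even though it is stored in the second shard, so a worker processing the first shard knows to open both files.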

Merge in conjunction with vllm-project/llm-compressor#2491

TODOs:

Test Plan:

brian-dellabetta and others added 30 commits February 26, 2026 22:36
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Updated README to accurately document the convert_checkpoint entrypoint,
including the Converter system, ModelOptNvfp4Converter, and usage examples.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
@mergify

mergify bot commented Apr 1, 2026

The quality checks have failed. Please run make style and make quality under
the root directory to address the lint failures. You will need to install the
dev optional install to get the required linting packages.

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
@mergify mergify bot removed the quality-failed label Apr 1, 2026
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
@mergify

mergify bot commented Apr 2, 2026

The quality checks have failed. Please run make style and make quality under
the root directory to address the lint failures. You will need to install the
dev optional install to get the required linting packages.

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
@mergify mergify bot removed the quality-failed label Apr 2, 2026
@mergify

mergify bot commented Apr 2, 2026

The quality checks have failed. Please run make style and make quality under
the root directory to address the lint failures. You will need to install the
dev optional install to get the required linting packages.

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
@mergify mergify bot removed the quality-failed label Apr 2, 2026
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
@brian-dellabetta brian-dellabetta marked this pull request as ready for review April 2, 2026 22:50
@brian-dellabetta brian-dellabetta changed the title [convert] Support for DeepSeek-V3.2 [convert] Support for DeepSeek-V3.2 and Dequantizing Apr 2, 2026
Collaborator

@kylesayrs kylesayrs left a comment

Outside of error for single-shard models, looks good to me! Thanks for being open to suggestions

Comment on lines +112 to +115
all_dependencies: set[str] = set()
for values in weight_deps_dict.values():
    for value in values:
        all_dependencies.add(value)

Suggested change
- all_dependencies: set[str] = set()
- for values in weight_deps_dict.values():
-     for value in values:
-         all_dependencies.add(value)
+ all_dependencies: set[str] = set().union(*weight_deps_dict.values())
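
As a quick check that the suggested one-liner behaves like the original nested loop, here is a small sketch. The function names and sample data are made up for illustration; the one-liner also works when the values are dicts, because iterating a dict yields its keys.

```python
def collect_with_loop(weight_deps_dict: dict) -> set[str]:
    """Original form: add every dependency name one at a time."""
    all_dependencies: set[str] = set()
    for values in weight_deps_dict.values():
        for value in values:
            all_dependencies.add(value)
    return all_dependencies


def collect_with_union(weight_deps_dict: dict) -> set[str]:
    """Suggested form: union all value collections in a single call."""
    return set().union(*weight_deps_dict.values())


# Made-up sample data for the comparison.
sample_sets = {"a.weight": {"a.weight_scale_inv"}, "b.weight": {"b.weight_scale_inv"}}
sample_dicts = {"a.weight": {"a.weight_scale_inv": True}}
```

Calling set().union() with no arguments returns an empty set, so the one-liner is also safe when weight_deps_dict is empty.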

"""
Given a weight name, return a dictionary of all dependency weight names, so that
weights can be processed correctly and in a parallelized fashion.
If a dependency is optional, the value associated with the key should be False.

Please give an example of an "optional dependency" to make the concept clear

return current_deps

# map of weight name -> ( map of dependency name -> is_required )
weight_deps_dict: dict[str, set[str]] = defaultdict(set)

Suggested change
- weight_deps_dict: dict[str, set[str]] = defaultdict(set)
+ weight_deps_dict: dict[str, dict[str, bool]] = dict()
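
To illustrate the shape the suggested annotation describes (weight name -> dependency name -> is_required), here is a minimal sketch. The helper missing_required and all tensor names are hypothetical and are not taken from the PR; in particular, the optional entry is invented only to show how the flag could be used.

```python
def missing_required(
    weight_deps_dict: dict[str, dict[str, bool]],
    available_names: set[str],
) -> list[str]:
    """Return required dependency names that are absent from the checkpoint."""
    return sorted(
        dep
        for deps in weight_deps_dict.values()
        for dep, is_required in deps.items()
        if is_required and dep not in available_names
    )


# Made-up entries: one required dependency and one optional one.
example_deps: dict[str, dict[str, bool]] = {
    "a.weight": {
        "a.weight_scale_inv": True,
        "a.hypothetical_optional": False,
    },
}
example_missing = missing_required(example_deps, {"a.weight"})
```

Only the required entry is reported as missing; the optional one is skipped because its flag is False.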

else:
    continue
weight_to_add_shard_name = weight_map[weight_to_add_name]
resolved_path = model_files.get(weight_to_add_shard_name)

Suggested change
- resolved_path = model_files.get(weight_to_add_shard_name)
+ resolved_path = model_files[weight_to_add_shard_name]
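
The difference between the two lookups matters when a shard name is missing from model_files. A small sketch, with hypothetical helper names and inputs:

```python
from collections import defaultdict


def add_with_get(model_files: dict, shard_name: str, weight_name: str) -> dict:
    """.get returns None for a missing shard, which then becomes a dict key."""
    inverse_weight_map = defaultdict(list)
    resolved_path = model_files.get(shard_name)
    inverse_weight_map[resolved_path].append(weight_name)
    return dict(inverse_weight_map)


def add_with_index(model_files: dict, shard_name: str, weight_name: str) -> dict:
    """Indexing raises KeyError immediately for a missing shard."""
    inverse_weight_map = defaultdict(list)
    resolved_path = model_files[shard_name]
    inverse_weight_map[resolved_path].append(weight_name)
    return dict(inverse_weight_map)
```

With an empty model_files, add_with_get quietly files the weight under the key None, while add_with_index fails at the point of the mistake.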

resolved_path = model_files.get(weight_to_add_shard_name)
inverse_weight_map[resolved_path].append(weight_to_add_name)

# return dicts, not defaultdicts, to avoid silent errors

ty
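
For readers unfamiliar with the "silent errors" the code comment refers to, here is a minimal sketch of the behavior, using made-up shard names: reading a missing key from a defaultdict creates an entry, while a plain dict raises.

```python
from collections import defaultdict


def finalize(mapping: defaultdict) -> dict:
    """Convert to a plain dict so later lookups of missing keys fail loudly."""
    return dict(mapping)


inverse_weight_map = defaultdict(list)
inverse_weight_map["model-00001.safetensors"].append("a.weight")

# Merely reading a missing key from a defaultdict inserts an empty entry.
_ = inverse_weight_map["typo.safetensors"]
created_silently = "typo.safetensors" in inverse_weight_map

# After conversion, a lookup of an unknown key raises KeyError instead.
plain_map = finalize(inverse_weight_map)
```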

Comment on lines +72 to +73
# NOTE: sometimes models split weights across different files
logger.warning(

Shouldn't this be an error?

Comment on lines +85 to +93
disallowed_names = ["weight_scale_inv"]
untargeted_names = [
    name for name in tensors.keys() if name not in targeted_names
]
for name in untargeted_names:
    param_name = name.rsplit(".", 1)[-1]

    if param_name in disallowed_names:
        raise ValueError(f"Found unexpected non-targeted tensor {name}")

This makes sense for now, but is inflexible if we want to support mixed recipes. For example, convert some weights to full precision, but convert other weights from fp8 to compressed-tensors

# Read weight map from safetensors.index.json
index_file = find_safetensors_index_file(model_files)
with open(index_file, "r") as f:
    weight_map: dict[str, str] = json.load(f)["weight_map"]

This will error if there is no index file. Use something like this instead

def get_weight_map(model_files):
    index_file = find_safetensors_index_file(model_files)
    if index_file is not None:
        with open(index_file, "r") as f:
            return json.load(f)["weight_map"]
    else:
        with safe_open(SAFE_WEIGHTS_NAME, "r") as file:
            return {tensor: SAFE_WEIGHTS_NAME for tensor in file.keys()}
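
A self-contained variant of the same fallback idea is sketched below. It is an assumption-laden illustration rather than the code that landed: it reads tensor names straight from the safetensors file header with the standard library instead of calling safe_open, it assumes model_files maps file names to resolved local paths, and it assumes the conventional Hugging Face file names for the two constants.

```python
import json
import struct

# Assumed to match the conventional Hugging Face file names.
SAFE_WEIGHTS_NAME = "model.safetensors"
SAFE_WEIGHTS_INDEX_NAME = "model.safetensors.index.json"


def read_safetensors_tensor_names(path: str) -> list[str]:
    """Read tensor names from a safetensors header without loading any data.

    The file starts with a little-endian u64 header length, followed by a
    JSON header whose keys are tensor names plus an optional "__metadata__".
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return [name for name in header if name != "__metadata__"]


def get_weight_map(model_files: dict[str, str]) -> dict[str, str]:
    """Return tensor name -> shard file name, with or without an index file."""
    for file_name, resolved_path in model_files.items():
        if file_name.endswith(SAFE_WEIGHTS_INDEX_NAME):
            with open(resolved_path, "r") as f:
                return json.load(f)["weight_map"]
    # Single-shard fallback: every tensor lives in the one weights file.
    names = read_safetensors_tensor_names(model_files[SAFE_WEIGHTS_NAME])
    return {name: SAFE_WEIGHTS_NAME for name in names}
```

Indexing model_files[SAFE_WEIGHTS_NAME] in the fallback means a checkpoint with neither an index nor a single weights file fails with a KeyError rather than returning an empty map.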

config_data = json.load(file)

config_data[QUANTIZATION_CONFIG_NAME] = quant_config_data
if quant_config_data is None:

qconfig field is not guaranteed to exist

Suggested change
- if quant_config_data is None:
+ if quant_config_data is None and QUANTIZATION_CONFIG_NAME in config_data:
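
One way to express the same guard is sketched below, assuming the intent is to drop any existing quantization config from config.json when the converter returns None. The function name apply_quant_config is hypothetical, and the value of QUANTIZATION_CONFIG_NAME is an assumption.

```python
from typing import Optional

QUANTIZATION_CONFIG_NAME = "quantization_config"  # assumed key name


def apply_quant_config(config_data: dict, quant_config_data: Optional[dict]) -> dict:
    """Set the quantization config, or remove it when there is none."""
    if quant_config_data is None:
        # pop with a default never raises, even if the key is absent
        config_data.pop(QUANTIZATION_CONFIG_NAME, None)
    else:
        config_data[QUANTIZATION_CONFIG_NAME] = quant_config_data
    return config_data
```

Using pop with a default folds the membership check into the removal, so a config.json that never had the field passes through unchanged.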

:return: absolute path to the safetensors index file, or None if not found
"""
for file_path, resolved_path in model_files.items():
    if file_path.endswith("safetensors.index.json"):

Suggested change
- if file_path.endswith("safetensors.index.json"):
+ if file_path.endswith(SAFE_WEIGHTS_INDEX_NAME):

Labels

documentation Improvements or additions to documentation


3 participants