Commit 7a4cc3e

fix
Signed-off-by: Hemil Desai <hemild@nvidia.com>
1 parent 5912e10 commit 7a4cc3e

1 file changed, +1 -13 lines changed
nemo_automodel/components/models/deepseek_v32/state_dict_adapter.py

Lines changed: 1 addition & 13 deletions
@@ -30,19 +30,7 @@
 
 
 class DeepSeekV32StateDictAdapter(DeepSeekV3StateDictAdapter):
-    """State dict adapter for DeepSeek V3.2.
-
-    Extends the V3 adapter with support for the new Indexer weights:
-    - self_attn.indexer.wq_b.weight
-    - self_attn.indexer.wk.weight
-    - self_attn.indexer.k_norm.weight (LayerNorm)
-    - self_attn.indexer.k_norm.bias (LayerNorm)
-    - self_attn.indexer.weights_proj.weight
-
-    The indexer weights use the same naming convention between HF and native format,
-    so no special key mapping is needed. The main difference is handling the
-    k_norm LayerNorm which should not be quantized.
-    """
+    """State dict adapter for DeepSeek V3.2."""
 
     # Base non-quantized keys from V3
     _base_non_quantized_keys = [
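
The removed docstring noted that the indexer's k_norm LayerNorm should not be quantized. As a rough illustration of how a non-quantized key list could be applied, here is a minimal sketch that assumes suffix matching on state dict keys. The names `NON_QUANTIZED_KEY_SUFFIXES` and `should_quantize`, and the `model.layers.0.` prefix, are hypothetical and not taken from the adapter's source; the actual contents and use of `_base_non_quantized_keys` are not shown in this diff.

```python
# Hypothetical sketch, not the adapter's actual implementation.
# Assumed list of key suffixes for tensors kept in full precision.
NON_QUANTIZED_KEY_SUFFIXES = [
    "self_attn.indexer.k_norm.weight",
    "self_attn.indexer.k_norm.bias",
]


def should_quantize(key: str, non_quantized_suffixes=NON_QUANTIZED_KEY_SUFFIXES) -> bool:
    """Return False for keys ending in an excluded suffix, True otherwise."""
    return not any(key.endswith(suffix) for suffix in non_quantized_suffixes)


# Example keys; the "model.layers.0." prefix is assumed for illustration.
keys = [
    "model.layers.0.self_attn.indexer.wq_b.weight",
    "model.layers.0.self_attn.indexer.k_norm.weight",
    "model.layers.0.self_attn.indexer.k_norm.bias",
]
to_quantize = [k for k in keys if should_quantize(k)]
```

Under these assumptions only the `wq_b` projection weight is selected for quantization, while both k_norm tensors are left untouched.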