nemo_automodel/components/models/deepseek_v32
1 file changed: +1 -13 lines changed
 class DeepSeekV32StateDictAdapter(DeepSeekV3StateDictAdapter):
-    """State dict adapter for DeepSeek V3.2.
-
-    Extends the V3 adapter with support for the new Indexer weights:
-    - self_attn.indexer.wq_b.weight
-    - self_attn.indexer.wk.weight
-    - self_attn.indexer.k_norm.weight (LayerNorm)
-    - self_attn.indexer.k_norm.bias (LayerNorm)
-    - self_attn.indexer.weights_proj.weight
-
-    The indexer weights use the same naming convention between HF and native format,
-    so no special key mapping is needed. The main difference is handling the
-    k_norm LayerNorm which should not be quantized.
-    """
+    """State dict adapter for DeepSeek V3.2."""

     # Base non-quantized keys from V3
     _base_non_quantized_keys = [
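
Reviewer note: the removed docstring was the only place that listed the Indexer weights and stated that the k_norm LayerNorm should not be quantized. The sketch below shows one way that rule could be expressed as a standalone check. It is a minimal illustration, not the adapter's actual code: the helper names (`is_indexer_key`, `should_skip_quantization`) and the `model.layers.0.` key prefix are hypothetical. Only the key suffixes come from the removed docstring.

```python
# Hypothetical sketch of the rule described in the removed docstring.
# Nothing here is taken from the real DeepSeekV32StateDictAdapter.

# Indexer weight suffixes listed in the removed docstring.
INDEXER_KEY_SUFFIXES = (
    "self_attn.indexer.wq_b.weight",
    "self_attn.indexer.wk.weight",
    "self_attn.indexer.k_norm.weight",
    "self_attn.indexer.k_norm.bias",
    "self_attn.indexer.weights_proj.weight",
)

# The docstring singles out the k_norm LayerNorm as not to be quantized.
NON_QUANTIZED_INDEXER_SUFFIXES = (
    "self_attn.indexer.k_norm.weight",
    "self_attn.indexer.k_norm.bias",
)


def is_indexer_key(key: str) -> bool:
    """Return True if the state dict key names one of the Indexer weights."""
    return key.endswith(INDEXER_KEY_SUFFIXES)


def should_skip_quantization(key: str) -> bool:
    """Return True for the Indexer k_norm parameters, which stay unquantized."""
    return key.endswith(NON_QUANTIZED_INDEXER_SUFFIXES)


# Example keys; the "model.layers.0." prefix is an assumption for illustration.
example_keys = [
    "model.layers.0.self_attn.indexer.wq_b.weight",
    "model.layers.0.self_attn.indexer.k_norm.weight",
    "model.layers.0.self_attn.indexer.k_norm.bias",
    "model.layers.0.mlp.gate_proj.weight",
]

skipped = [k for k in example_keys if should_skip_quantization(k)]
indexer = [k for k in example_keys if is_indexer_key(k)]
```

Because the docstring says the Indexer keys use the same names in the HF and native formats, the sketch needs no renaming step, only the membership checks.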