Commit 7a50d2e

Authored by yaoyu-33
[doc] fix: Clarify from_hf_config limitation for export workflows (#2113)
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
1 parent b78d9cd commit 7a50d2e

1 file changed: +20 −7
docs/bridge-guide.md (20 additions, 7 deletions)

@@ -199,20 +199,33 @@ model = bridge.to_megatron_model()  # Uses default settings
 ```
 
 ### 3. Leverage the Parameter Streaming API
-You can stream converted weights from Megatron to HF without saving to disk. You can also use config-only loading for architecture exploration without loading weights:
+You can stream converted weights from Megatron to HF without saving to disk:
 
 ```python
 # ✅ Use streaming for large models
 for name, weight in bridge.export_hf_weights(model, cpu=True):
     process_weight(name, weight)
+```
+
+### 4. Use `from_hf_pretrained` for Export Workflows
+
+When exporting Megatron checkpoints back to 🤗 Hugging Face format, always use `from_hf_pretrained()` instead of `from_hf_config()`. The `from_hf_config()` method does not load the tokenizer and other artifacts required for saving a complete 🤗 Hugging Face checkpoint:
 
-# ✅ Use config-only loading for architecture exploration
-config = AutoConfig.from_pretrained("meta-llama/Llama-3-8B")
-bridge = AutoBridge.from_hf_config(config)
-transformer_config = bridge.transformer_config
-print(f"Hidden size: {transformer_config.hidden_size}")
+```python
+from megatron.bridge import AutoBridge
+
+# ✅ Correct: Use from_hf_pretrained for export workflows
+bridge = AutoBridge.from_hf_pretrained("meta-llama/Llama-3.2-1B")
+bridge.export_ckpt("./megatron_checkpoints/llama32_1b", "./hf_exports/llama32_1b")
+
+# ❌ Avoid: from_hf_config lacks artifacts needed for saving
+# config = AutoConfig.from_pretrained("meta-llama/Llama-3.2-1B")
+# bridge = AutoBridge.from_hf_config(config)  # Missing tokenizer, etc.
+# bridge.export_ckpt(...)  # Will fail!
 ```
 
+The `from_hf_config()` method is only suitable for architecture exploration and introspection (e.g., inspecting `transformer_config`), not for checkpoint conversion workflows.
+
 For more examples and advanced usage patterns, see the `examples/conversion/` directory in the repository.
 
 ## Convenience Workflows (Commands)
@@ -229,7 +242,7 @@ python -c "from megatron.bridge import AutoBridge; AutoBridge.import_ckpt('meta-
 ### Megatron → HF export (one call)
 
 ```bash
-python -c "from megatron.bridge import AutoBridge; from transformers import AutoConfig; cfg=AutoConfig.from_pretrained('meta-llama/Llama-3.2-1B'); b=AutoBridge.from_hf_config(cfg); b.export_ckpt('./megatron_checkpoints/llama32_1b','./hf_exports/llama32_1b')"
+python -c "from megatron.bridge import AutoBridge; b=AutoBridge.from_hf_pretrained('meta-llama/Llama-3.2-1B'); b.export_ckpt('./megatron_checkpoints/llama32_1b','./hf_exports/llama32_1b')"
 ```
 
 ### Create Megatron models and run locally
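The failure mode this commit documents can be sketched without the library itself: a bridge built from a config alone carries no tokenizer, so any save step that needs one cannot succeed. The classes below are hypothetical stand-ins for illustration only, not the real `megatron.bridge` API.

```python
# Toy sketch of the documented failure mode. `Bridge` is a hypothetical
# stand-in, NOT the real megatron.bridge.AutoBridge.

class Bridge:
    def __init__(self, config, tokenizer=None):
        self.config = config
        self.tokenizer = tokenizer  # config-only construction leaves this unset

    @classmethod
    def from_hf_pretrained(cls, name):
        # Loads the config AND the tokenizer/artifacts needed to save
        # a complete checkpoint (tokenizer modeled here as a placeholder).
        return cls(config={"name": name}, tokenizer=object())

    @classmethod
    def from_hf_config(cls, config):
        # Config only: fine for introspection, insufficient for export.
        return cls(config=config)

    def export_ckpt(self, megatron_path, hf_path):
        if self.tokenizer is None:
            raise RuntimeError("export requires a tokenizer; use from_hf_pretrained")
        return f"exported {megatron_path} -> {hf_path}"


ok = Bridge.from_hf_pretrained("meta-llama/Llama-3.2-1B")
print(ok.export_ckpt("./megatron_checkpoints/llama32_1b", "./hf_exports/llama32_1b"))

bad = Bridge.from_hf_config({"name": "meta-llama/Llama-3.2-1B"})
try:
    bad.export_ckpt("./megatron_checkpoints/llama32_1b", "./hf_exports/llama32_1b")
except RuntimeError as err:
    print("failed as expected:", err)
```

In this sketch the missing tokenizer is detected eagerly at export time; the guard mirrors why the doc change steers export workflows to `from_hf_pretrained()` and reserves `from_hf_config()` for architecture introspection.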

0 commit comments