You can stream converted weights from Megatron to HF without saving to disk:
```python
# ✅ Use streaming for large models
for name, weight in bridge.export_hf_weights(model, cpu=True):
    process_weight(name, weight)
```
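One way to write the `process_weight` callback is to fold each weight into a running summary, so that only one weight is held in memory at a time. The sketch below is not part of Megatron Bridge: `WeightSummary` and `fake_export` are hypothetical names, and `fake_export` stands in for `bridge.export_hf_weights(model, cpu=True)` by yielding plain Python lists instead of tensors.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class WeightSummary:
    """Running totals over a stream of (name, weight) pairs (hypothetical helper)."""

    num_tensors: int = 0
    num_elements: int = 0
    names: List[str] = field(default_factory=list)

    def process_weight(self, name: str, weight: List[float]) -> None:
        # Keep only metadata; the weight itself is not stored anywhere.
        self.num_tensors += 1
        self.num_elements += len(weight)
        self.names.append(name)


def fake_export() -> Iterator[Tuple[str, List[float]]]:
    """Stand-in for the real export generator; yields one pair at a time."""
    yield "model.embed_tokens.weight", [0.0] * 8
    yield "model.layers.0.self_attn.q_proj.weight", [0.0] * 16
    yield "lm_head.weight", [0.0] * 8


summary = WeightSummary()
for name, weight in fake_export():
    summary.process_weight(name, weight)
```

With real tensors, the element count would presumably come from the tensor itself (for example `weight.numel()` in PyTorch) rather than `len(weight)`.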
### 4. Use `from_hf_pretrained` for Export Workflows
When exporting Megatron checkpoints back to 🤗 Hugging Face format, always use `from_hf_pretrained()` instead of `from_hf_config()`. The `from_hf_config()` method does not load the tokenizer and other artifacts required for saving a complete 🤗 Hugging Face checkpoint:
```python
# bridge = AutoBridge.from_hf_config(config)  # Missing tokenizer, etc.
# bridge.export_ckpt(...)  # Will fail!
```
The `from_hf_config()` method is only suitable for architecture exploration and introspection (e.g., inspecting `transformer_config`), not for checkpoint conversion workflows.
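The distinction can be illustrated with a toy model of the behaviour described above. This is a sketch only: `ToyBridge`, `from_config`, `from_pretrained`, and `export` are hypothetical stand-ins, not the Megatron Bridge API, and the error raised here is an assumption about how such a failure might surface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyBridge:
    """Hypothetical stand-in for a bridge; not the real AutoBridge class."""

    config: dict
    tokenizer: Optional[str] = None  # set only when loaded from a full checkpoint

    @classmethod
    def from_config(cls, config: dict) -> "ToyBridge":
        # Config-only: the architecture is known, but no tokenizer is loaded.
        return cls(config=config)

    @classmethod
    def from_pretrained(cls, config: dict, tokenizer: str) -> "ToyBridge":
        # Full load: config plus the artifacts a complete checkpoint needs.
        return cls(config=config, tokenizer=tokenizer)

    def export(self) -> dict:
        if self.tokenizer is None:
            raise RuntimeError("config-only bridge cannot export a complete checkpoint")
        return {"config": self.config, "tokenizer": self.tokenizer}


config = {"hidden_size": 64, "num_layers": 2}

# Architecture exploration works with a config-only bridge.
explorer = ToyBridge.from_config(config)
hidden_size = explorer.config["hidden_size"]

# Export from a config-only bridge fails in this toy model.
try:
    explorer.export()
    export_failed = False
except RuntimeError:
    export_failed = True

# Export succeeds when the bridge was built from a full checkpoint.
exported = ToyBridge.from_pretrained(config, tokenizer="toy-tokenizer").export()
```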
For more examples and advanced usage patterns, see the `examples/conversion/` directory in the repository.