Commit 26ca019

comments
Signed-off-by: Ananth Subramaniam <[email protected]>
1 parent 7e31023 commit 26ca019

1 file changed, +5 -13 lines changed

docs/training/callbacks.md

Lines changed: 5 additions & 13 deletions
@@ -1,12 +1,6 @@
 # Callbacks
 
-Megatron Bridge provides a lightweight callback system for injecting custom logic into the training and evaluation loop without modifying framework code. This is ideal for:
-
-- Proprietary integrations
-- Custom logging and metrics tracking
-- Company-specific monitoring
-- Infrastructure heartbeats
-- Experiment tracking
+Megatron Bridge provides a lightweight callback system for injecting custom logic into the training and evaluation loop without modifying framework code. This is ideal for proprietary integrations or custom logging and metrics tracking.
 
 ## Quick Start
 
@@ -48,10 +42,10 @@ def log_step(context):
     if context.loss_dict:
         print(f"Step {step}: {context.loss_dict}")
 
-manager = CallbackManager()
-manager.register("on_train_step_end", log_step)
+callback_manager = CallbackManager()
+callback_manager.register("on_train_step_end", log_step)
 
-pretrain(config, forward_step_func, callbacks=manager)
+pretrain(config, forward_step_func, callbacks=callback_manager)
 ```
 
 ### Mixing Both Patterns
@@ -136,7 +130,7 @@ class StepCounterCallback(Callback):
 
 ## Distributed Training
 
-Callbacks fire on **all ranks** without framework-level synchronization. If your callback should only run on specific ranks, add rank guards:
+Callbacks fire on **all ranks** without framework-level synchronization. If your callback should only run on specific ranks, add guards:
 
 ```python
 import torch.distributed as dist
@@ -199,8 +193,6 @@ The callback system follows these principles:
 
 3. **Safety**: Callbacks receive framework state but modifying it is at the user's own risk. The framework makes no guarantees about the effects of modifications.
 
-4. **Simplicity**: No priority ordering, no control flow back to the training loop, no framework-level exception handling. Callbacks are purely additive.
-
 ## Examples
 
 ### Proprietary Metrics
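
Note on the Distributed Training hunk: the diff cuts off right after `import torch.distributed as dist`, so the guard itself is not visible here. As a rough sketch only, and not the code from `docs/training/callbacks.md`, a rank guard for a function-style callback could look like the following. The helper names `get_rank` and `rank_zero_only` are hypothetical, and the fallback to the `RANK` environment variable (which `torchrun` sets for each worker) is an assumption added so the sketch also runs without an initialized process group.

```python
import functools
import os
from types import SimpleNamespace

try:
    import torch.distributed as dist
except ImportError:  # torch not installed; rely on the RANK env var instead
    dist = None


def get_rank() -> int:
    """Return this process's global rank, or 0 for a single-process run."""
    if dist is not None and dist.is_available() and dist.is_initialized():
        return dist.get_rank()
    # Assumption: the launcher (e.g. torchrun) exports RANK for each worker.
    return int(os.environ.get("RANK", "0"))


def rank_zero_only(fn):
    """Hypothetical helper: run the wrapped callback only on global rank 0."""
    @functools.wraps(fn)
    def wrapper(context):
        if get_rank() != 0:
            return None  # every other rank skips the callback body
        return fn(context)
    return wrapper


@rank_zero_only
def log_step(context):
    # `context.loss_dict` mirrors the attribute used in the Quick Start hunk.
    if context.loss_dict:
        print(f"losses: {context.loss_dict}")
    return True


if __name__ == "__main__":
    # Stand-in for the real callback context, for illustration only.
    log_step(SimpleNamespace(loss_dict={"lm loss": 1.23}))
```

A callback wrapped this way could then be registered as in the Quick Start hunk, for example `callback_manager.register("on_train_step_end", log_step)`. Because callbacks fire on all ranks without framework-level synchronization, the guard only skips work on non-zero ranks; it does not add any barrier.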
