ℹ️ <em>See the [feature tracker](https://github.com/pytorch/ao/issues/556) and the [performance tracker](https://github.com/pytorch/ao/issues/1768) for upcoming features.</em>
## Training e2e benchmarks on NVIDIA B200
- Single-node training on 8x B200 GPUs power-limited to 750W, batch size 1, sequence length 8192, 100 steps, `torch.compile`, FSDP2, per-op selective activation checkpointing (SAC)
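
As a rough sanity check on the scale of this configuration, the token count per optimizer step can be worked out from the settings listed above. This is a minimal sketch, not part of the benchmark scripts: it assumes that "batch size 1" is the per-GPU (local) batch size and that all 8 GPUs act as data-parallel ranks under FSDP2. The function name `tokens_per_step` is hypothetical.

```python
# Assumption: "batch size 1" is the per-GPU (local) batch size, and all
# 8 GPUs are data-parallel ranks under FSDP2. Neither is stated explicitly.
NUM_GPUS = 8
LOCAL_BATCH_SIZE = 1
SEQ_LEN = 8192
STEPS = 100


def tokens_per_step(num_gpus: int, local_batch_size: int, seq_len: int) -> int:
    """Tokens processed across all ranks in one training step."""
    return num_gpus * local_batch_size * seq_len


per_step = tokens_per_step(NUM_GPUS, LOCAL_BATCH_SIZE, SEQ_LEN)
total = per_step * STEPS

print(f"tokens per step: {per_step}")
print(f"tokens over {STEPS} steps: {total}")
```

Under these assumptions, a reported tokens/second figure can be related to wall-clock step time by dividing `per_step` by the measured throughput.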
To reproduce these benchmarks, you can follow these steps:
1. On a machine with compatible GPUs, clone torchtitan and follow the local installation [steps](https://github.com/pytorch/torchtitan?tab=readme-ov-file#installation),
including [downloading a tokenizer](https://github.com/pytorch/torchtitan?tab=readme-ov-file#downloading-a-tokenizer).
2. Install torchao following these [steps](https://github.com/pytorch/ao/tree/main?tab=readme-ov-file#installation).
3. From the `torchao/` directory, you can run the following commands to reproduce the benchmarks above: