@@ -2,19 +2,15 @@
 Fabric (Beta)
 #############
 
-Fabric allows you to scale any PyTorch model with just a few lines of code!
-With Fabric, you can easily scale your model to run on distributed devices using the strategy of your choice while keeping complete control over the training loop and optimization logic.
+Fabric is the fast and lightweight way to scale PyTorch models without boilerplate code.
 
-With only a few changes to your code, Fabric allows you to:
-
-- Automatic placement of models and data onto the device
-- Automatic support for mixed precision (speedup and smaller memory footprint)
-- Seamless switching between hardware (CPU, GPU, TPU)
-- State-of-the-art distributed training strategies (DDP, FSDP, DeepSpeed)
-- Easy-to-use launch command for spawning processes (DDP, torchelastic, etc)
-- Multi-node support (TorchElastic, SLURM, and more)
-- You keep complete control of your training loop
+- Easily switch from running on CPU to GPU (Apple Silicon, CUDA, ...), TPU, multi-GPU, or even multi-node training
+- State-of-the-art distributed training strategies (DDP, FSDP, DeepSpeed) and mixed precision out of the box
+- Handles all the boilerplate device logic for you
+- Brings useful tools to help you build a trainer (callbacks, logging, checkpoints, ...)
+- Designed with multi-billion-parameter models in mind
 
+|
 
 .. code-block:: diff
 
@@ -60,6 +56,32 @@ With only a few changes to your code, Fabric allows you to: |
 ----
 
 
+***********
+Why Fabric?
+***********
+
+Fabric differentiates itself from a fully-fledged trainer like :doc:`Lightning Trainer <../common/trainer>` in these key aspects:
+
+**Fast to implement**
+There is no need to restructure your code: Just change a few lines in the PyTorch script and you'll be able to leverage Fabric features.
+
+**Maximum Flexibility**
+Write your own training and/or inference logic down to the individual optimizer calls.
+You aren't forced to conform to a standardized epoch-based training loop like the one in :doc:`Lightning Trainer <../common/trainer>`.
+You can do flexible iteration-based training, meta-learning, cross-validation, and other types of optimization algorithms without digging into framework internals.
+This also makes it super easy to adopt Fabric in existing PyTorch projects to speed up and scale your models without large refactors.
+Just remember: With great power comes great responsibility.
+
+**Maximum Control**
+The :doc:`Lightning Trainer <../common/trainer>` has many built-in features to make research simpler with less boilerplate, but debugging it requires some familiarity with the framework internals.
+In Fabric, everything is opt-in. Think of it as a toolbox: You take out the tools (Fabric functions) you need and leave the others behind.
+This makes it easier to develop and debug your PyTorch code as you gradually add more features to it.
+Fabric provides important tools to remove undesired boilerplate code (distributed, hardware, checkpoints, logging, ...), but leaves the design and orchestration fully up to you.
+
+
+----
+
+
 ************
 Fundamentals
 ************
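
The **Fast to implement** point in the diff above claims that only a few lines of a plain PyTorch script need to change. As a rough illustration (not the exact snippet from the ``code-block:: diff`` whose body is elided above), the sketch below shows what those lines typically are: construct a ``Fabric`` object, route the model, optimizer, and dataloader through ``setup``/``setup_dataloaders``, and call ``fabric.backward``. The ``lightning.fabric`` import path, the toy model and data, and the specific constructor arguments are assumptions made for this example; the ``accelerator``/``devices``/``strategy`` arguments are the knobs the hardware and multi-node bullets at the top of the page refer to.

.. code-block:: python

    import torch
    from torch import nn
    from torch.utils.data import DataLoader, TensorDataset

    # Assumed import path for the Beta-era package; adjust to your installed version.
    from lightning.fabric import Fabric


    def train():
        # Hardware, strategy, and precision are constructor arguments, e.g.
        # Fabric(accelerator="cuda", devices=4, strategy="ddp") plus a precision flag.
        fabric = Fabric(accelerator="cpu", devices=1)
        fabric.launch()  # no-op for a single device; sets up worker processes for multi-device strategies

        # Toy model and data standing in for your own.
        model = nn.Linear(32, 2)
        optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
        dataset = TensorDataset(torch.randn(64, 32), torch.randint(0, 2, (64,)))
        dataloader = DataLoader(dataset, batch_size=8)

        # The "few lines": Fabric wraps the model/optimizer and dataloader so that
        # device placement and distributed setup happen for you.
        model, optimizer = fabric.setup(model, optimizer)
        dataloader = fabric.setup_dataloaders(dataloader)

        model.train()
        for batch, target in dataloader:
            optimizer.zero_grad()
            loss = nn.functional.cross_entropy(model(batch), target)
            fabric.backward(loss)  # replaces loss.backward()
            optimizer.step()


    if __name__ == "__main__":
        train()

Scaling out is then mostly a matter of changing the ``Fabric(...)`` arguments rather than the training loop itself, which is the point the bullet list at the top of the page is making.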