1 parent af8c728 commit b382747
doc/fluid/design/dist_train/large_model.md
@@ -11,6 +11,10 @@ the gradient to Parameter Server to execute the optimize program.

## Design

+**NOTE**: This approach is a feature of Fluid distributed training; you may want
+to read [Distributed Architecture](./distributed_architecture.md) and
+[Parameter Server](./parameter_server.md) before reading the following content.
+
Fluid large model distributed training uses the
[Distributed Transpiler](./parameter_server.md#distributed-transpiler) to split
a large parameter into multiple parameters which are stored on the Parameter Server, and
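The hunk above describes the splitting only in prose. As a rough illustration of what "split a large parameter into multiple parameters" means, here is a minimal Python sketch; it is not the actual Distributed Transpiler API, and the `split_parameter` helper and the `param.blockN` shard-naming scheme are assumptions made for this example:

```python
# Hypothetical sketch of parameter splitting; this is NOT the real
# Fluid Distributed Transpiler, only an illustration of the idea.

def split_parameter(param_name, param_rows, pserver_endpoints):
    """Split a large row-major parameter into per-pserver shards.

    param_name:        name of the original large parameter.
    param_rows:        total number of rows (e.g. the vocabulary size
                       of a large embedding table).
    pserver_endpoints: list of "ip:port" strings, one per Parameter Server.

    Returns {endpoint: (shard_name, row_begin, row_end)}.
    """
    num_shards = len(pserver_endpoints)
    # Ceiling division so every row lands in exactly one shard.
    rows_per_shard = (param_rows + num_shards - 1) // num_shards
    placement = {}
    for i, endpoint in enumerate(pserver_endpoints):
        begin = i * rows_per_shard
        end = min(begin + rows_per_shard, param_rows)
        if begin >= end:
            break  # more pservers than rows; leave the rest unassigned
        # Shard names like "embedding.w.block0" are an assumed convention.
        placement[endpoint] = ("%s.block%d" % (param_name, i), begin, end)
    return placement


if __name__ == "__main__":
    # Example: a 1,000,000-row embedding split over two Parameter Servers.
    shards = split_parameter(
        "embedding.w", 1000000,
        ["192.168.0.1:6174", "192.168.0.2:6174"])
    for endpoint, (name, begin, end) in sorted(shards.items()):
        print(endpoint, name, begin, end)
```

Splitting by contiguous row ranges keeps the shards roughly balanced across Parameter Servers, and each shard name records which block of the original parameter it holds, so the pieces can be addressed individually or reassembled.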