Commit 15fcdaf

Polish README.md (#2138)
* Fix grammar errors.
* Update the train command sample in README.md
* Update argument values.
1 parent cfbfcf2 commit 15fcdaf

1 file changed: +9 -5 lines changed

README.md

Lines changed: 9 additions & 5 deletions
@@ -29,8 +29,12 @@ with Keras API, train the model distributedly with a command line.
 
 ```bash
 elasticdl train \
-  --model_def=mnist_functional_api.custom_model \
-  --training_data=/mnist/train --output=output
+  --image_name=elasticdl:mnist \
+  --model_zoo=model_zoo \
+  --model_def=mnist_functional_api.mnist_functional_api.custom_model \
+  --training_data=/data/mnist/train \
+  --job_name=test-mnist \
+  --volume="host_path=/data,mount_path=/data"
 ```
 
 ### Integration with SQLFlow
@@ -56,9 +60,9 @@ computing job would fail; however, we can restart the job and recover its status
 from the most recent checkpoint files.
 
 ElasticDL, as an enhancement of TensorFlow's distributed training feature,
-supports fault-tolerance. In the case that some processes fail, the job would go
-on running. Therefore, ElasticDL doesn't need to checkpoint nor recover from
-checkpoints.
+supports fault-tolerance. In the case that some processes fail, the job would
+go on running. Therefore, ElasticDL doesn't need to save checkpoint nor recover
+from checkpoints.
 
 The feature of fault-tolerance makes ElasticDL works with the priority-based
 preemption of Kubernetes to achieve elastic scheduling. When Kubernetes kills
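
The fault-tolerance paragraph in the second hunk says that the job goes on running when some processes fail. A toy simulation can illustrate that idea. This is only a sketch and not ElasticDL's implementation: `run_job`, the worker names, and the `fails_on` mapping are all hypothetical, and the "workers" are plain list entries rather than real processes. It assumes that a task held by a failed worker can simply be handed to a surviving one.

```python
from collections import deque


def run_job(tasks, workers, fails_on):
    """Finish every task even if some workers die while holding one.

    tasks:    list of task ids to process.
    workers:  list of worker names (hypothetical, not real processes).
    fails_on: dict mapping a worker name to the task id on which it crashes.
    Returns (completed, alive): task ids in completion order and the
    workers that survived.
    """
    todo = deque(tasks)
    alive = list(workers)
    completed = []
    while todo:
        if not alive:
            raise RuntimeError("all workers failed; the job cannot continue")
        worker = alive.pop(0)      # next worker in round-robin order
        task = todo.popleft()
        if fails_on.get(worker) == task:
            # The worker crashed: drop it and put its task back in the
            # queue so that a surviving worker picks it up later.
            todo.append(task)
            continue
        completed.append(task)
        alive.append(worker)       # the worker is free again
    return completed, alive


completed, alive = run_job(
    tasks=[0, 1, 2, 3, 4],
    workers=["w0", "w1", "w2"],
    fails_on={"w1": 1},            # w1 dies while processing task 1
)
print(sorted(completed), alive)
```

In this run, `w1` is lost but task 1 is still completed by another worker, so the job finishes without restarting from a checkpoint.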
