Commit 3558e62

Update ReadMe (#2531)
* Fix readme
* Update readme
1 parent a5a6c50 commit 3558e62

File tree

1 file changed (+17 lines, -20 lines)

README.md

Lines changed: 17 additions & 20 deletions
@@ -5,8 +5,8 @@
 [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
 [![PyPI Status Badge](https://badge.fury.io/py/elasticdl-client.svg)](https://pypi.org/project/elasticdl-client/)
 
-ElasticDL is a Kubernetes-native deep learning framework built on top of
-TensorFlow 2.0 that supports fault-tolerance and elastic scheduling.
+ElasticDL is a Kubernetes-native deep learning framework
+that supports fault-tolerance and elastic scheduling.
 
 ## Main Features
 
@@ -16,11 +16,11 @@ Through Kubernetes-native design, ElasticDL enables fault-tolerance and works
 with the priority-based preemption of Kubernetes to achieve elastic scheduling
 for deep learning tasks.
 
-### TensorFlow 2.0 Eager Execution
+### Support TensorFlow and PyTorch
 
-A distributed deep learning framework needs to know local gradients before the
-model update. Eager Execution allows ElasticDL to do it without hacking into the
-graph execution process.
+- TensorFlow Estimator.
+- TensorFlow Keras.
+- PyTorch
 
 ### Minimalism Interface
 
@@ -37,30 +37,27 @@ elasticdl train \
   --volume="host_path=/data,mount_path=/data"
 ```
 
-### Integration with SQLFlow
-
-ElasticDL will be integrated seamlessly with SQLFlow to connect SQL to
-distributed deep learning tasks with ElasticDL.
-
-```sql
-SELECT * FROM employee LABEL income INTO my_elasticdl_model
-```
-
 ## Quick Start
 
 Please check out our [step-by-step tutorial](docs/tutorials/get_started.md) for
 running ElasticDL on local laptop, on-prem cluster, or on public cloud such as
 Google Kubernetes Engine.
 
+[TensorFlow Estimator on MiniKube](docs/tutorials/elasticdl_estimator.md)
+
+[TensorFlow Keras on MiniKube](docs/tutorials/elasticdl_local.md)
+
+[PyTorch on MiniKube](docs/tutorials/elasticdl_torch.md)
+
 ## Background
 
-TensorFlow has its native distributed computing feature that is
+TensorFlow/PyTorch has its native distributed computing feature that is
 fault-recoverable. In the case that some processes fail, the distributed
 computing job would fail; however, we can restart the job and recover its status
 from the most recent checkpoint files.
 
-ElasticDL, as an enhancement of TensorFlow's distributed training feature,
-supports fault-tolerance. In the case that some processes fail, the job would
+ElasticDL supports fault-tolerance during distributed training.
+In the case that some processes fail, the job would
 go on running. Therefore, ElasticDL doesn't need to save checkpoint nor recover
 from checkpoints.
 
@@ -80,11 +77,11 @@ first job completes. In this case, the overall utilization is 100%.
 
 The feature of elastic scheduling of ElasticDL comes from its Kubernetes-native
 design -- it doesn't rely on Kubernetes extensions like Kubeflow to run
-TensorFlow programs; instead, the master process of an ElasticDL job calls
+TensorFlow/PyTorch programs; instead, the master process of an ElasticDL job calls
 Kubernetes API to start workers and parameter servers; it also watches events
 like process/pod killing and reacts to such events to realize fault-tolerance.
 
-In short, ElasticDL enhances TensorFlow with fault-tolerance and elastic
+In short, ElasticDL enhances TensorFlow/PyTorch with fault-tolerance and elastic
 scheduling in the case that you have a Kubernetes cluster. We provide a tutorial
 showing how to set up a Kubernetes cluster on Google Cloud and run ElasticDL
 jobs there. We respect TensorFlow's native distributed computing feature, which
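The last hunk above describes the architecture pattern: the master process calls the Kubernetes API to start worker and parameter-server pods and watches pod events (such as a pod being killed) to react and keep the job running. The following is a minimal, hypothetical sketch of that pattern using the official `kubernetes` Python client; the labels, images, and helper names (`start_worker`, `watch_workers`) are illustrative assumptions, not ElasticDL's actual implementation.

```python
# Sketch only: launch a worker pod and react to pod failures via the
# Kubernetes API, in the spirit of the ElasticDL master described above.
from kubernetes import client, config, watch

config.load_kube_config()  # use config.load_incluster_config() inside a pod
core_v1 = client.CoreV1Api()


def start_worker(namespace: str, name: str, image: str) -> None:
    """Ask the Kubernetes API server to create one worker pod."""
    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name=name, labels={"app": "demo-worker"}),
        spec=client.V1PodSpec(
            restart_policy="Never",
            containers=[client.V1Container(name="worker", image=image)],
        ),
    )
    core_v1.create_namespaced_pod(namespace=namespace, body=pod)


def watch_workers(namespace: str) -> None:
    """Watch worker pod events and relaunch a pod when one fails or is killed."""
    for event in watch.Watch().stream(
        core_v1.list_namespaced_pod,
        namespace=namespace,
        label_selector="app=demo-worker",
    ):
        pod = event["object"]
        phase = pod.status.phase if pod.status else None
        if event["type"] == "DELETED" or phase == "Failed":
            # A worker disappeared; a fault-tolerant master would reassign its
            # tasks and then bring up a replacement pod.
            start_worker(namespace, pod.metadata.name + "-retry", "python:3.9")
```

A real master would also track which data shards or tasks the failed worker owned and hand them to surviving workers before (or instead of) relaunching a replacement.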
