Commit 42cb7a2

author
Yi Wang
committed
Update Docker build design doc to incorporate comments
1 parent db045ac commit 42cb7a2

File tree

5 files changed

+87
-78
lines changed

paddle/scripts/docker/README.md

Lines changed: 87 additions & 78 deletions
@@ -1,39 +1,21 @@
1-
We need to complete the initial draft https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/docker/README.md.
1+
# Building PaddlePaddle
22

3-
I am recording some ideas here, and we should file a PR later.
3+
## Goals
44

5-
## Current Status
5+
We want the build procedure to generate Docker images, so that we can run PaddlePaddle applications on Kubernetes clusters.
66

7-
Currently, we have four sets of Dockerfiles:
7+
We want it to generate .deb packages, so that enterprises without Docker support can run PaddlePaddle applications as well.
88

9-
1. Kubernetes examples:
9+
We want to minimize the size of the generated Docker images and .deb packages to reduce deployment cost.
1010

11-
```
12-
doc/howto/usage/k8s/src/Dockerfile -- based on released image but add start.sh
13-
doc/howto/usage/k8s/src/k8s_data/Dockerfile -- contains only get_data.sh
14-
doc/howto/usage/k8s/src/k8s_train/Dockerfile -- this duplicates with the first one.
15-
```
11+
We want to encapsulate building tools and dependencies in a *development* Docker image to simplify tool installation for developers.
1612

17-
1. Generate .deb packages:
13+
We want developers to be able to use whatever editing tools they prefer (emacs, vim, Eclipse, Jupyter Notebook). The development Docker image therefore contains only building tools, not editing tools, and developers are expected to git clone the source code onto their development computers rather than into the container running the development Docker image.
1814

19-
```
20-
paddle/scripts/deb/build_scripts/Dockerfile -- significantly overlaps with the `docker` directory
21-
```
15+
We want the procedure and tools to also work for testing, continuous integration, and releasing.
2216

23-
1. In the `docker` directory:
2417

25-
```
26-
paddle/scripts/docker/Dockerfile
27-
paddle/scripts/docker/Dockerfile.gpu
28-
```
29-
30-
1. Document building
31-
32-
```
33-
paddle/scripts/tools/build_docs/Dockerfile -- a subset of above two sets.
34-
```
35-
36-
## Goal
18+
## Docker Images
3719

3820
We want two Docker images for each version of PaddlePaddle:
3921

@@ -45,7 +27,9 @@ We want two Docker images for each version of PaddlePaddle:
4527
- release engineers -- use this to build the official release from a certain branch/tag on Github.com.
4628
- document writers / Website developers -- Our documents are in the source repo in the form of .md/.rst files and comments in source code. We need tools to extract the information, typeset, and generate Web pages.
4729

48-
So the development image must contain not only source code building tools, but also documentation tools:
30+
Of course, developers can install building tools on their development computers. But different versions of PaddlePaddle might require different sets or versions of building tools. Also, collaborative debugging is easier if all developers use a unified development environment.
31+
32+
The development image should include the following tools:
4933

5034
- gcc/clang
5135
- nvcc
@@ -54,7 +38,7 @@ We want two Docker images for each version of PaddlePaddle:
5438
- woboq
5539
- sshd
5640

57-
where `sshd` makes it easy for developers to have multiple terminals connecting into the container.
41+
where `sshd` makes it easy for developers to have multiple terminals connecting into the container. `docker exec` works too, but if the container is running on a remote machine, it would be easier to ssh directly into the container than ssh to the box and run `docker exec`.
5842

5943
1. `paddle:<version>`
6044

@@ -65,82 +49,107 @@ We want two Docker images for each version of PaddlePaddle:
6549
- no-GPU/AVX `paddle:<version>`
6650
- no-GPU/no-AVX `paddle:<version>-noavx`
6751

68-
We'd like to give users choices of GPU and no-GPU, because the GPU version image is much larger than the no-GPU version.
52+
We'd like to give users the choice between GPU and no-GPU, because the GPU version image is much larger than the no-GPU version.
53+
54+
We'd like to give users the choice between AVX and no-AVX, because some cloud providers don't provide AVX-enabled VMs.
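
One way to choose between the AVX and no-AVX images on a Linux host is to inspect the CPU flags. The following is only a minimal sketch and not part of this design: the helper name `pick_tag` is hypothetical, and it assumes the host exposes its CPU flags in `/proc/cpuinfo`.

```bash
#!/bin/sh
# Hypothetical helper: choose an image tag from a CPU flags string.
# $1 = space-separated CPU flags, $2 = PaddlePaddle version.
pick_tag() {
    case " $1 " in
        *" avx "*) echo "paddle:$2" ;;        # CPU advertises AVX
        *)         echo "paddle:$2-noavx" ;;  # fall back to the no-AVX image
    esac
}

# On Linux, the flags can be read from /proc/cpuinfo (an assumption
# about the host; other systems expose this information differently).
if [ -r /proc/cpuinfo ]; then
    flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
    pick_tag "$flags" "<version>"
fi
```

The flags string is padded with spaces before matching so that `avx` is matched as a whole word and a CPU that lists only `avx2` or `avx512f` is not mistaken for one that lists `avx`.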
55+
56+
57+
## Development Environment
58+
59+
Here we describe how to use the two images above. We start by considering our daily development environment.
60+
61+
Developers work on a computer, which is usually a laptop or desktop:
62+
63+
![](doc/paddle-development-environment.png)
64+
65+
or they might rely on a more powerful machine (for example, one with GPUs):
66+
67+
![](doc/paddle-development-environment-gpu.png)
68+
69+
A basic principle is that source code lies on the development computer (host), so that editing tools like Eclipse can parse the source code and support auto-completion.
70+
6971

70-
We'd like to give users choices of AVX and no-AVX, because some cloud providers don't provide AVX-enabled VMs.
72+
## Usages
7173

72-
## Dockerfile
74+
### Build the Development Docker Image
7375

74-
To realize above goals, we need only one Dockerfile for the development image. We can put it in the root source directory.
76+
The following commands check out the source code on the development computer (host) and build the development image `paddle:dev`:
7577

76-
Let us go over our daily development procedure to show how developers can use this file.
78+
```bash
79+
git clone https://github.com/PaddlePaddle/Paddle paddle
80+
cd paddle
81+
docker build -t paddle:dev .
82+
```
7783

78-
1. Check out the source code
84+
The `docker build` command assumes that `Dockerfile` is in the root of the source tree. This is reasonable because, in this design, it is the only Dockerfile in our repo.
7985

80-
```bash
81-
git clone https://github.com/PaddlePaddle/Paddle paddle
82-
```
8386

84-
1. Do something
87+
### Build PaddlePaddle from Source Code
8588

86-
```bash
87-
cd paddle
88-
git checkout -b my_work
89-
Edit some files
90-
```
89+
Given the development image `paddle:dev`, the following command builds PaddlePaddle from the source tree on the development computer (host):
9190

92-
1. Build/update the development image (if not yet)
91+
```bash
92+
docker run -v $PWD:/paddle -e "GPU=OFF" -e "AVX=ON" -e "TEST=ON" paddle:dev
93+
```
9394

94-
```bash
95-
docker build -t paddle:dev . # Suppose that the Dockerfile is in the root source directory.
96-
```
95+
This command mounts the source directory on the host into `/paddle` in the container, so the default entrypoint of `paddle:dev`, `build.sh`, would build the source code with possible local changes. When it writes to `/paddle/build` in the container, it actually writes to `$PWD/build` on the host.
9796

98-
1. Build the source code
97+
`build.sh` builds the following:
9998

100-
```bash
101-
docker run -v $PWD:/paddle -e "GPU=OFF" -e "AVX=ON" -e "TEST=ON" paddle:dev
102-
```
99+
- PaddlePaddle binaries,
100+
- `$PWD/build/paddle-<version>.deb` for production installation, and
101+
- `$PWD/build/Dockerfile`, which builds the production Docker image.
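
To illustrate how the `GPU`, `AVX`, and `TEST` environment variables passed via `docker run -e` could reach the build, here is a minimal sketch. It is not the actual `build.sh`: the function name `cmake_flags`, the default values, and the `-DWITH_*` CMake option names are assumptions made for illustration.

```bash
#!/bin/sh
# Hypothetical sketch of how build.sh might read its settings; this is
# not the actual script. The -DWITH_* CMake option names are assumptions.
cmake_flags() {
    # Unset variables fall back to illustrative defaults.
    echo "-DWITH_GPU=${GPU:-OFF} -DWITH_AVX=${AVX:-ON} -DWITH_TESTING=${TEST:-OFF}"
}

# Print the configure command rather than running it, so the sketch
# is safe to try outside the container.
echo "cmake /paddle $(cmake_flags)"
```

Keeping the translation from environment variables to build options in one place would let the same image serve developers, CI, and release builds by changing only the `-e` arguments.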
103102

104-
This command maps the source directory on the host into `/paddle` in the container.
105103

106-
Please be aware that the default entrypoint of `paddle:dev` is a shell script file `build.sh`, which builds the source code, and outputs to `/paddle/build` in the container, which is actually `$PWD/build` on the host.
104+
### Build the Production Docker Image
107105

108-
`build.sh` doesn't only build binaries, but also generates a `$PWD/build/Dockerfile` file, which can be used to build the production image. We will talk about it later.
106+
The following command builds the production image:
109107

110-
1. Run on the host (Not recommended)
108+
```bash
109+
docker build -t paddle -f build/Dockerfile .
110+
```
111111

112-
If the host computer happens to have all dependent libraries and Python runtimes installed, we can now run/test the built program. But the recommended way is to run it in a production image.
112+
This production image is minimal -- it includes the `paddle` binary, the shared library `libpaddle.so`, and the Python runtime.
113113

114-
1. Run in the development container
114+
### Run PaddlePaddle Applications
115115

116-
`build.sh` generates binary files and invokes `make install`. So we can run the built program within the development container. This is convenient for developers.
116+
Again, development happens on the host. Suppose that we have a simple application program in `a.py`; we can test and run it using the production image:
117117

118-
1. Build a production image
118+
```bash
119+
docker run -it -v $PWD:/work paddle /work/a.py
120+
```
119121

120-
On the host, we can use the `$PWD/build/Dockerfile` to generate a production image.
122+
But this works only if all dependencies of `a.py` are in the production image. If this is not the case, we need to build a new Docker image based on the production image, with the additional dependencies installed.
121123

122-
```bash
123-
docker build -t paddle --build-arg "BOOK=ON" -f build/Dockerfile .
124-
```
124+
### Build and Run PaddlePaddle Applications
125125

126-
1. Run the Paddle Book
126+
We need a Dockerfile in https://github.com/paddlepaddle/book that builds the Docker image `paddlepaddle/book:<version>`, based on the PaddlePaddle production image:
127127

128-
Once we have the production image, we can run [Paddle Book](http://book.paddlepaddle.org/) chapters in Jupyter Notebooks (if we chose to build them)
128+
```
129+
FROM paddlepaddle/paddle:<version>
130+
RUN pip install -U matplotlib jupyter ...
131+
COPY . /book
132+
EXPOSE 8080
133+
CMD ["jupyter"]
134+
```
129135

130-
```bash
131-
docker run -it paddle
132-
```
136+
The book image is an example of a PaddlePaddle application image. We can build it as follows:
133137

134-
Note that the default entrypoint of the production image starts Jupyter server, if we chose to build Paddle Book.
138+
```bash
139+
git clone https://github.com/paddlepaddle/book
140+
cd book
141+
docker build -t book .
142+
```
135143

136-
1. Run on Kubernetes
144+
### Build and Run Distributed Applications
137145

138-
We can push the production image to a DockerHub server, so developers can run distributed training jobs on the Kubernetes cluster:
146+
In our [API design doc](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/api.md#distributed-training), we proposed an API that starts a distributed training job on a cluster. This API needs to build a PaddlePaddle application into a Docker image as above and call kubectl to run it on the cluster. To do so, it might need to generate a Dockerfile like the one above and call `docker build`.
139147

140-
```bash
141-
docker tag paddle me/paddle
142-
docker push
143-
kubectl ...
144-
```
148+
Of course, we can manually build an application image and launch the job using the kubectl tool:
145149

146-
For end users, we will provide more convenient tools to run distributed jobs.
150+
```bash
151+
docker build -f some/Dockerfile -t myapp .
152+
docker tag myapp me/myapp
153+
docker push me/myapp
154+
kubectl ...
155+
```
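
The manual steps can be wrapped in a small helper. The following is a hedged sketch only: the function name `publish_cmds` and its arguments are hypothetical, and it prints the commands instead of running them. Note that `docker push` takes the name of the image to push as its argument.

```bash
#!/bin/sh
# Hypothetical helper: print the commands that build, tag, and push an
# application image. $1 = Dockerfile path, $2 = local image name,
# $3 = registry account (for example a Docker Hub user name).
publish_cmds() {
    echo "docker build -f $1 -t $2 ."
    echo "docker tag $2 $3/$2"
    echo "docker push $3/$2"
}

publish_cmds some/Dockerfile myapp me
```

Printing the commands first makes it easy to review them before piping the output to `sh` or copying it into a CI job.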
