
Commit 7e03cba

Merge pull request #1161 from helinwang/k8s_aws
clarify and fix problems in paddle on aws k8s (create cluster part)
2 parents b000386 + 50afa35 commit 7e03cba

doc/howto/usage/k8s/k8s_aws_en.md

Lines changed: 92 additions & 46 deletions
@@ -2,18 +2,16 @@
22

33
## Create AWS Account and IAM Account
44

5-
AWS account allow us to manage AWS from Web Console. Amazon AMI enable us to manage AWS from command line interface.
6-
7-
We need to create an AMI user with sufficient privilege to create kubernetes cluster on AWS.
5+
Under each AWS account, we can create multiple [IAM](http://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html) users. This allows us to grant some privileges to each IAM user and to create/operate AWS clusters as an IAM user.
86

97
To sign up for an AWS account, please
108
follow
119
[this guide](http://docs.aws.amazon.com/lambda/latest/dg/setting-up.html).
12-
To create users and user groups under an AWS account, please
10+
To create IAM users and user groups under an AWS account, please
1311
follow
1412
[this guide](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html).
1513

16-
Please be aware that this tutorial needs the following privileges for the user in AMI:
14+
Please be aware that this tutorial needs the following privileges for the user in IAM:
1715

1816
- AmazonEC2FullAccess
1917
- AmazonS3FullAccess
@@ -27,14 +25,6 @@ Please be aware that this tutorial needs the following privileges for the user i
2725
- AWSKeyManagementServicePowerUser
2826

2927

30-
By the time we write this tutorial, we noticed that Chinese AWS users
31-
might suffer from authentication problems when running this tutorial.
32-
Our solution is that we create a VM instance with the default Amazon
33-
AMI and in the same zone as our cluster runs, so we can SSH to this VM
34-
instance as a tunneling server and control our cluster and jobs from
35-
it.
36-
37-
3828
## PaddlePaddle on AWS
3929

4030
Here we show, step by step, how to run PaddlePaddle training on an AWS cluster.
@@ -59,7 +49,7 @@ gpg2 --fingerprint FC8A365E
5949
```
6050
The correct key fingerprint is `18AD 5014 C99E F7E3 BA5F 6CE9 50BD D3E0 FC8A 365E`
6151

62-
Go to the [releases](https://github.com/coreos/kube-aws/releases) and download the latest release tarball and detached signature (.sig) for your architecture.
52+
We can download `kube-aws` from its [release page](https://github.com/coreos/kube-aws/releases). In this tutorial, we use version 0.9.1.
6353

6454
Validate the tarball's GPG signature:
6555

@@ -88,14 +78,22 @@ mv ${PLATFORM}/kube-aws /usr/local/bin
8878

8979
[kubectl](https://kubernetes.io/docs/user-guide/kubectl-overview/) is a command line interface for running commands against Kubernetes clusters.
9080

91-
Go to the [releases](https://github.com/kubernetes/kubernetes/releases) and download the latest release tarball.
92-
93-
Extract the tarball and then concate the kubernetes binaries directory into PATH:
81+
Download `kubectl` from the Kubernetes release artifact site with the `curl` tool.
9482

9583
```
96-
export PATH=<path/to/kubernetes-directory>/platforms/linux/amd64:$PATH # The exact path depend on your platform
84+
# OS X
85+
curl -O https://storage.googleapis.com/kubernetes-release/release/"$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)"/bin/darwin/amd64/kubectl
86+
87+
# Linux
88+
curl -O https://storage.googleapis.com/kubernetes-release/release/"$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)"/bin/linux/amd64/kubectl
9789
```
9890

91+
Make the kubectl binary executable and move it to your PATH (e.g. `/usr/local/bin`):
92+
93+
```
94+
chmod +x ./kubectl
95+
sudo mv ./kubectl /usr/local/bin/kubectl
96+
```
9997

10098
### Configure AWS Credentials
10199

@@ -109,17 +107,18 @@ aws configure
109107
```
110108

111109

112-
Fill in the required fields (You can get your AWS aceess key id and AWS secrete access key by following [this](http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html) instruction):
110+
Fill in the required fields:
113111

114112

115113
```
116114
AWS Access Key ID: YOUR_ACCESS_KEY_ID
117115
AWS Secrete Access Key: YOUR_SECRETE_ACCESS_KEY
118-
Default region name: us-west-2
116+
Default region name: us-west-1
119117
Default output format: json
120-
121118
```
122119

120+
`YOUR_ACCESS_KEY_ID` and `YOUR_SECRETE_ACCESS_KEY` are the IAM access key ID and secret access key from [Create AWS Account and IAM Account](#create-aws-account-and-iam-account).
121+
123122
Verify that your credentials work by describing any instances you may already have running on your account:
124123

125124
```
@@ -134,7 +133,9 @@ The keypair that will authenticate SSH access to your EC2 instances. The public
134133

135134
Follow the [EC2 Keypair docs](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html) to create an EC2 key pair.
136135

137-
After creating a key pair, you will use the name you gave the keys to configure the cluster. Key pairs are only available to EC2 instances in the same region.
136+
After creating a key pair, you will use the key pair name to configure the cluster.
137+
138+
Key pairs are only available to EC2 instances in the same region. We are using us-west-1 in this tutorial, so make sure to create the key pair in that region (N. California).
138139

139140
#### KMS key
140141

@@ -143,12 +144,12 @@ Amazon KMS keys are used to encrypt and decrypt cluster TLS assets. If you alrea
143144
You can create a KMS key in the AWS console, or with the aws command line tool:
144145

145146
```
146-
$ aws kms --region=us-west-1 create-key --description="kube-aws assets"
147+
aws kms --region=us-west-1 create-key --description="kube-aws assets"
147148
{
148149
"KeyMetadata": {
149150
"CreationDate": 1458235139.724,
150151
"KeyState": "Enabled",
151-
"Arn": "arn:aws:kms:us-west-1:xxxxxxxxx:key/xxxxxxxxxxxxxxxxxxx",
152+
"Arn": "arn:aws:kms:us-west-1:aaaaaaaaaaaaa:key/xxxxxxxxxxxxxxxxxxx",
152153
"AWSAccountId": "xxxxxxxxxxxxx",
153154
"Enabled": true,
154155
"KeyUsage": "ENCRYPT_DECRYPT",
@@ -158,11 +159,11 @@ $ aws kms --region=us-west-1 create-key --description="kube-aws assets"
158159
}
159160
```
160161

161-
You will use the `KeyMetadata.Arn` string to identify your KMS key in the init step.
162+
We will need to use the value of `Arn` later.
162163

163164
And then you need to add several inline policies in your user permission.
164165

165-
Go to AMI user page, click on `Add inline policy` button, and then select `Custom Policy`
166+
Go to the IAM user page, click the `Add inline policy` button, and then select `Custom Policy`.
166167

167168
Paste in the following inline policies:
168169

@@ -178,7 +179,7 @@ paste into following inline policies:
178179
"kms:Encrypt"
179180
],
180181
"Resource": [
181-
"arn:aws:kms:*:xxxxxxxxx:key/*"
182+
"arn:aws:kms:*:AWS_ACCOUNT_ID:key/*"
182183
]
183184
},
184185
{
@@ -194,29 +195,37 @@ paste into following inline policies:
194195
"cloudformation:DescribeStackEvents"
195196
],
196197
"Resource": [
197-
"arn:aws:cloudformation:us-west-1:xxxxxxxxx:stack/YOUR_CLUSTER_NAME/*"
198+
"arn:aws:cloudformation:us-west-1:AWS_ACCOUNT_ID:stack/MY_CLUSTER_NAME/*"
198199
]
199200
}
200201
]
201202
}
202203
```
203204

205+
`AWS_ACCOUNT_ID`: You can get it with the following command:
206+
207+
```
208+
aws sts get-caller-identity --output text --query Account
209+
```
210+
211+
`MY_CLUSTER_NAME`: Pick any cluster name you like. You will use it again later.
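
The two `Resource` strings in the policy can be filled in mechanically once the account ID and cluster name are known. A minimal sketch, assuming the hypothetical helper names `kms_resource_arn` and `stack_resource_arn` (they are not part of `kube-aws` or the aws cli) and a made-up account ID:

```shell
#!/bin/sh
# Hypothetical helpers: build the "Resource" strings used in the inline policy.

# $1: AWS account ID
kms_resource_arn() {
    printf 'arn:aws:kms:*:%s:key/*\n' "$1"
}

# $1: region, $2: AWS account ID, $3: cluster name
stack_resource_arn() {
    printf 'arn:aws:cloudformation:%s:%s:stack/%s/*\n' "$1" "$2" "$3"
}

# 123456789012 is a made-up account ID; in practice take the value printed by
#   aws sts get-caller-identity --output text --query Account
kms_resource_arn 123456789012
stack_resource_arn us-west-1 123456789012 MY_CLUSTER_NAME
```

The printed strings can then be pasted into the `Resource` arrays of the policy.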
204212

205213
#### External DNS name
206214

207-
When the cluster is created, the controller will expose the TLS-secured API on a public IP address. You will need to create an A record for the external DNS hostname you want to point to this IP address. You can find the API external IP address after the cluster is created by invoking kube-aws status.
215+
When the cluster is created, the controller will expose the TLS-secured API on a DNS name.
216+
217+
The A record of that DNS name needs to point to the cluster IP address.
218+
219+
We will need this DNS name later in the tutorial. If you don't already own one, you can choose any DNS name (e.g., `paddle`) and modify `/etc/hosts` to associate the cluster IP with that DNS name.
208220

209221
#### S3 bucket
210222

211223
You need to create an S3 bucket before starting up the Kubernetes cluster.
212224

213-
command (need to have a global unique name):
225+
The aws cli has some bugs when creating an S3 bucket, so let's use the [Web console](https://console.aws.amazon.com/s3/home?region=us-west-1).
214226

215-
```
216-
paddle aws s3api --region=us-west-1 create-bucket --bucket bucket-name
217-
```
227+
Click on `Create Bucket`, fill in a unique BUCKET_NAME, and make sure the region is us-west-1 (Northern California).
218228

219-
If you get an error message, try a different bucket name. The bucket name needs to be globally unique.
220229

221230
#### Initialize an asset directory
222231

@@ -230,33 +239,44 @@ $ cd my-cluster
230239
Initialize the cluster CloudFormation stack with the KMS Arn, key pair name, and DNS name from the previous step:
231240

232241
```
233-
$ kube-aws init \
234-
--cluster-name=my-cluster-name \
235-
--external-dns-name=my-cluster-endpoint \
242+
kube-aws init \
243+
--cluster-name=MY_CLUSTER_NAME \
244+
--external-dns-name=MY_EXTERNAL_DNS_NAME \
236245
--region=us-west-1 \
237-
--availability-zone=us-west-1c \
238-
--key-name=key-pair-name \
246+
--availability-zone=us-west-1a \
247+
--key-name=KEY_PAIR_NAME \
239248
--kms-key-arn="arn:aws:kms:us-west-1:xxxxxxxxxx:key/xxxxxxxxxxxxxxxxxxx"
240249
```
241250

242-
Here `us-west-1c` is used for parameter `--availability-zone`, but supported availability zone varies among AWS accounts.
251+
`MY_CLUSTER_NAME`: the one you picked in [KMS key](#kms-key)
243252

244-
Please check if `us-west-1c` is supported by `aws ec2 --region us-west-1 describe-availability-zones`, if not switch to other supported availability zone. (e.g., `us-west-1a`, or `us-west-1b`)
253+
`MY_EXTERNAL_DNS_NAME`: see [External DNS name](#external-dns-name)
254+
255+
`KEY_PAIR_NAME`: see [EC2 key pair](#ec2-key-pair)
256+
257+
`--kms-key-arn`: the "Arn" in [KMS key](#kms-key)
258+
259+
Here `us-west-1a` is used for the `--availability-zone` parameter, but the supported availability zones vary among AWS accounts.
260+
261+
Please check whether `us-west-1a` is supported by running `aws ec2 --region us-west-1 describe-availability-zones`. If it is not, switch to another supported availability zone (e.g., `us-west-1b`).
262+
263+
Note: please don't use `us-west-1c`. Subnets can currently only be created in the following availability zones: us-west-1b, us-west-1a.
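
The zone check can be scripted. A minimal sketch, assuming a hypothetical helper `pick_zone` (not part of `kube-aws` or the aws cli) that reads zone names and skips `us-west-1c`; the sample input stands in for real aws cli output:

```shell
#!/bin/sh
# Hypothetical helper: read zone names on stdin (whitespace separated)
# and print the first one that is not us-west-1c.
pick_zone() {
    tr -s '[:space:]' '\n' | grep -v -x -F 'us-west-1c' | head -n 1
}

# Sample input standing in for the output of:
#   aws ec2 --region us-west-1 describe-availability-zones \
#       --query 'AvailabilityZones[].ZoneName' --output text
printf 'us-west-1c\tus-west-1a\tus-west-1b\n' | pick_zone
```

With the sample input this prints `us-west-1a`, which can be passed to `--availability-zone`.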
245264

246265
There will now be a cluster.yaml file in the asset directory. This is the main configuration file for your cluster.
247266

267+
248268
#### Render contents of the asset directory
249269

250270
In the simplest case, you can have kube-aws generate both your TLS identities and certificate authority for you.
251271

252272
```
253-
$ kube-aws render credentials --generate-ca
273+
kube-aws render credentials --generate-ca
254274
```
255275

256276
The next command generates the default set of cluster assets in your asset directory.
257277

258278
```
259-
sh $ kube-aws render stack
279+
kube-aws render stack
260280
```
261281

262282
Here's what the directory structure looks like:
@@ -292,15 +312,41 @@ These assets (templates and credentials) are used to create, update and interact
292312

293313
#### Create the instances defined in the CloudFormation template
294314

295-
Now for the exciting part, creating your cluster (choose any `<prefix>`):
315+
Now let's create your cluster (choose any PREFIX for the command below):
296316

297317
```
298-
$ kube-aws up --s3-uri s3://<your-bucket-name>/<prefix>
318+
kube-aws up --s3-uri s3://BUCKET_NAME/PREFIX
299319
```
300320

321+
`BUCKET_NAME`: the bucket name that you used in [S3 bucket](#s3-bucket)
322+
323+
301324
#### Configure DNS
302325

303-
You can invoke `kube-aws status` to get the cluster API endpoint after cluster creation, if necessary. This command can take a while. And use command `dig` to check the load balancer hostname to get the ip address, use this ip to setup an A record for your external dns name.
326+
You can invoke `kube-aws status` to get the cluster API endpoint after cluster creation.
327+
328+
```
329+
$ kube-aws status
330+
Cluster Name: paddle-cluster
331+
Controller DNS Name: paddle-cl-ElbAPISe-EEOI3EZPR86C-531251350.us-west-1.elb.amazonaws.com
332+
```
333+
334+
Use command `dig` to check the load balancer hostname to get the ip address.
335+
336+
```
337+
$ dig paddle-cl-ElbAPISe-EEOI3EZPR86C-531251350.us-west-1.elb.amazonaws.com
338+
339+
;; QUESTION SECTION:
340+
;paddle-cl-ElbAPISe-EEOI3EZPR86C-531251350.us-west-1.elb.amazonaws.com. IN A
341+
342+
;; ANSWER SECTION:
343+
paddle-cl-ElbAPISe-EEOI3EZPR86C-531251350.us-west-1.elb.amazonaws.com. 59 IN A 54.241.164.52
344+
paddle-cl-ElbAPISe-EEOI3EZPR86C-531251350.us-west-1.elb.amazonaws.com. 59 IN A 54.67.102.112
345+
```
346+
347+
In the above output, both IPs, `54.241.164.52` and `54.67.102.112`, will work.
348+
349+
If you own a DNS name, set its A record to either of the above IPs. Otherwise, you can edit `/etc/hosts` to associate one of the IPs with the DNS name.
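
For the `/etc/hosts` route, the IP can be pulled out of the `dig` answer and turned into a hosts entry. A minimal sketch, assuming the hypothetical helpers `a_record_ips` and `hosts_entry` and the example DNS name `paddle`:

```shell
#!/bin/sh
# Hypothetical helper: print the IPv4 addresses of the A records found in
# `dig` output read on stdin. Lines starting with ';' are comments, so the
# QUESTION section is skipped.
a_record_ips() {
    awk '!/^;/ && $3 == "IN" && $4 == "A" { print $5 }'
}

# Hypothetical helper: format an /etc/hosts entry. $1: IP, $2: DNS name
hosts_entry() {
    printf '%s %s\n' "$1" "$2"
}

# Sample answer line copied from the dig output in this section.
sample='paddle-cl-ElbAPISe-EEOI3EZPR86C-531251350.us-west-1.elb.amazonaws.com. 59 IN A 54.241.164.52'
ip=$(printf '%s\n' "$sample" | a_record_ips | head -n 1)
hosts_entry "$ip" paddle
```

This prints `54.241.164.52 paddle`. After checking the line, it can be appended to `/etc/hosts` with root privileges, for example by piping it to `sudo tee -a /etc/hosts`.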
304350

305351
#### Access the cluster
306352
