Commit cb05c07

Merge pull request openshift#8242 from andfasano/agent-day2-cluster-script
AGENT-863: node-joiner cluster script
2 parents be92db7 + 89bcfdf commit cb05c07

6 files changed

+278
-5
lines changed

Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
# Adding a node via the node-joiner tool

## Pre-requisites
1. The `oc` tool must be available in the execution environment (the "user host").
2. The user host must have a valid network connection to the target OpenShift cluster to be expanded.

## Setup
1. Download the [node-joiner.sh](./node-joiner.sh) script into a working directory on
   the user host (the "assets folder").
2. Create a `nodes-config.yaml` file in the assets folder. This configuration file must list
   all the nodes to be added to the target cluster. At minimum, each node's hostname and the MAC address of its primary interface must be specified. For example:
```
hosts:
- hostname: extra-worker-0
  interfaces:
  - name: eth0
    macAddress: 00:02:46:e3:9e:7c
- hostname: extra-worker-1
  interfaces:
  - name: eth0
    macAddress: 00:02:46:e3:9e:8c
- hostname: extra-worker-2
  interfaces:
  - name: eth0
    macAddress: 00:02:46:e3:9e:9c
```
3. Optionally, an `NMState` configuration block can be specified for each node through the `networkConfig` field
   (it is applied during the first boot). For example:
```
hosts:
- hostname: extra-worker-0
  interfaces:
  - name: eth0
    macAddress: 00:02:46:e3:9e:7c
  networkConfig:
    interfaces:
      - name: eth0
        type: ethernet
        state: up
        mac-address: 00:02:46:e3:9e:7c
        ipv4:
          enabled: true
          address:
            - ip: 192.168.111.90
              prefix-length: 24
          dhcp: false
    dns-resolver:
      config:
        server:
          - 192.168.111.1
    routes:
      config:
        - destination: 0.0.0.0/0
          next-hop-address: 192.168.111.1
          next-hop-interface: eth0
          table-id: 254
- hostname: extra-worker-1
  interfaces:
  - name: eth0
    macAddress: 00:02:46:e3:9e:8c
- hostname: extra-worker-2
  interfaces:
  - name: eth0
    macAddress: 00:02:46:e3:9e:9c
```

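Before running the script, a quick local check of the configuration file can catch a missing or mistyped `macAddress` early. The sketch below is a rough, grep-based sanity check and not a YAML parser. `check_nodes_config` is a hypothetical helper that is not part of `node-joiner.sh`, and it assumes MAC addresses are written unquoted as six colon-separated hex octets, as in the examples above.

```shell
#!/bin/bash
# Hypothetical helper, not part of node-joiner.sh: a rough sanity check of a
# nodes-config.yaml file. It greps for the expected keys and does not parse YAML.
check_nodes_config() {
  local file="${1:-nodes-config.yaml}"
  local hosts macs badMacs

  if [ ! -f "$file" ]; then
    echo "Cannot find the config file $file"
    return 1
  fi

  # Count host entries and host-level MAC addresses. The NMState key is
  # spelled "mac-address", so the second pattern does not match it.
  hosts=$(grep -c -E '^[[:space:]]*(-[[:space:]]+)?hostname:' "$file" || true)
  macs=$(grep -c -E '^[[:space:]]*(-[[:space:]]+)?macAddress:' "$file" || true)

  # Count macAddress values that are not six colon-separated hex octets.
  badMacs=$(grep -E '^[[:space:]]*(-[[:space:]]+)?macAddress:' "$file" \
    | grep -c -v -E 'macAddress:[[:space:]]*([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}[[:space:]]*$' || true)

  if [ "$hosts" -eq 0 ]; then
    echo "No hosts found in $file"
    return 1
  fi
  if [ "$macs" -lt "$hosts" ]; then
    echo "Found $hosts hosts but only $macs macAddress entries"
    return 1
  fi
  if [ "$badMacs" -ne 0 ]; then
    echo "Found $badMacs malformed macAddress values"
    return 1
  fi
  echo "$file looks plausible: $hosts hosts, $macs MAC addresses"
}
```

Calling `check_nodes_config nodes-config.yaml` returns a non-zero status when a problem is found, so it can gate the call to `./node-joiner.sh`. The node-joiner tool remains the authority on whether the file is valid.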
## ISO generation
Run the [node-joiner.sh](./node-joiner.sh) script:
```bash
$ ./node-joiner.sh
```
The script creates a temporary namespace, prefixed with `openshift-node-joiner`, in the target cluster
and launches a pod there to run the node-joiner workload.
On success, the `node.x86_64.iso` ISO image is downloaded to the assets folder.

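The namespace name is derived from the random suffix of the temporary pull-secret file that the script creates. The sketch below isolates that step. The wrapper name `derive_namespace` and the sample file name are assumptions made for illustration, while the expansion and the `tr` call are the ones used in `node-joiner.sh`.

```shell
#!/bin/bash
# Sketch of how node-joiner.sh derives the namespace name from the temp file.
# derive_namespace is a hypothetical wrapper name used only for this example.
derive_namespace() {
  local pullSecretFile="$1"
  # Strip the fixed "/tmp/nodejoiner-" prefix, keep the random suffix, and
  # lowercase it, because namespace names must be lowercase.
  echo "openshift-node-joiner-${pullSecretFile#/tmp/nodejoiner-}" | tr '[:upper:]' '[:lower:]'
}

# Example with a made-up temp file name:
derive_namespace "/tmp/nodejoiner-AbC123xYz0"   # prints openshift-node-joiner-abc123xyz0
```

Reusing the `mktemp` suffix gives each run its own namespace without a second source of randomness.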
### Configuration file name
By default, the script looks for a configuration file named `nodes-config.yaml`. A different
configuration file name can be passed as the first parameter of the script:

```bash
$ ./node-joiner.sh config.yaml
```

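The default comes from a standard shell parameter expansion. A minimal sketch of that behavior follows, where `resolve_config_file` is a hypothetical function name and the expansion mirrors the one at the top of `node-joiner.sh`.

```shell
#!/bin/bash
# Hypothetical wrapper showing the default-value expansion used by the script.
resolve_config_file() {
  # ${1:-default} expands to the first argument, or to the default when the
  # argument is unset or empty.
  local nodesConfigFile="${1:-nodes-config.yaml}"
  echo "$nodesConfigFile"
}

resolve_config_file               # prints nodes-config.yaml
resolve_config_file config.yaml   # prints config.yaml
```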
## Nodes joining
Use the ISO image to boot all the nodes listed in the configuration file, and wait for the related
certificate signing requests (CSRs) to appear. Each new node added to the cluster generates two pending
CSRs, which must be manually approved by the user.
Use the following command to monitor the pending certificates:
```
$ oc get csr
```
Use the `oc adm certificate approve` command to approve them:
```
$ oc adm certificate approve <csr_name>
```
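When several nodes are added at once, approving each CSR by name becomes tedious. The sketch below approves every CSR that has no status yet, which is how pending requests appear. `approve_pending_csrs` is a hypothetical helper name, and the approach assumes that every pending CSR in the cluster is expected and safe to approve, which should be verified before using it.

```shell
#!/bin/bash
# Hypothetical helper: approve all CSRs that are still pending.
# Assumption: every pending CSR belongs to a node you intend to add.
approve_pending_csrs() {
  local name
  # List the names of CSRs without a status (that is, still pending) and
  # approve them one by one.
  oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' \
    | while read -r name; do
        if [ -n "$name" ]; then
          oc adm certificate approve "$name" </dev/null
        fi
      done
}
```

Because each node generates two CSRs, and the second typically appears only after the first has been approved, the function may need to be run more than once per node.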
Once all the pending certificates have been approved, the new node becomes available:
```
$ oc get nodes
NAME             STATUS   ROLES                  AGE   VERSION
extra-worker-0   Ready    worker                 1h    v1.29.3+8628c3c
master-0         Ready    control-plane,master   31h   v1.29.3+8628c3c
master-1         Ready    control-plane,master   32h   v1.29.3+8628c3c
master-2         Ready    control-plane,master   32h   v1.29.3+8628c3c
```
Lines changed: 144 additions & 0 deletions
@@ -0,0 +1,144 @@
#!/bin/bash

set -eu

# Config file
nodesConfigFile=${1:-"nodes-config.yaml"}
if [ ! -f "$nodesConfigFile" ]; then
  echo "Cannot find the config file $nodesConfigFile"
  exit 1
fi

# Set up a cleanup function to ensure that the temporary
# file is removed when the script completes.
cleanup() {
  # The default expansion avoids an "unbound variable" error under `set -u`
  # if the trap fires before pullSecretFile has been assigned.
  if [ -f "${pullSecretFile:-}" ]; then
    echo "Removing temporary file $pullSecretFile"
    rm "$pullSecretFile"
  fi
}
trap cleanup EXIT TERM

# Retrieve the pull secret and store it in a temporary file.
pullSecretFile=$(mktemp -p "/tmp" -t "nodejoiner-XXXXXXXXXX")
oc get secret -n openshift-config pull-secret -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d > "$pullSecretFile"

# Extract the baremetal-installer image pullspec from the current cluster.
nodeJoinerPullspec=$(oc adm release info --image-for=baremetal-installer --registry-config="$pullSecretFile")

# Use the same random temp file suffix for the namespace.
namespace=$(echo "openshift-node-joiner-${pullSecretFile#/tmp/nodejoiner-}" | tr '[:upper:]' '[:lower:]')

# Create the namespace to run the node-joiner, along with the required roles and bindings.
staticResources=$(cat <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: ${namespace}
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: node-joiner
  namespace: ${namespace}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-joiner
rules:
- apiGroups:
  - config.openshift.io
  resources:
  - clusterversions
  - proxies
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - secrets
  - configmaps
  - nodes
  verbs:
  - get
  - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: node-joiner
subjects:
- kind: ServiceAccount
  name: node-joiner
  namespace: ${namespace}
roleRef:
  kind: ClusterRole
  name: node-joiner
  apiGroup: rbac.authorization.k8s.io
EOF
)
echo "$staticResources" | oc apply -f -

# Generate a configMap to store the user configuration
oc create configmap nodes-config --from-file=nodes-config.yaml="${nodesConfigFile}" -n "${namespace}" -o yaml --dry-run=client | oc apply -f -

# Run the node-joiner pod to generate the ISO
nodeJoinerPod=$(cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: node-joiner
  namespace: ${namespace}
  annotations:
    openshift.io/scc: anyuid
  labels:
    app: node-joiner
spec:
  restartPolicy: Never
  serviceAccountName: node-joiner
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: node-joiner
    imagePullPolicy: IfNotPresent
    image: $nodeJoinerPullspec
    volumeMounts:
    - name: nodes-config
      mountPath: /config
    - name: assets
      mountPath: /assets
    command: ["/bin/sh", "-c", "cp /config/nodes-config.yaml /assets; HOME=/assets node-joiner add-nodes --dir=/assets --log-level=debug; sleep 600"]
  volumes:
  - name: nodes-config
    configMap:
      name: nodes-config
      namespace: ${namespace}
  - name: assets
    emptyDir:
      sizeLimit: "4Gi"
EOF
)
echo "$nodeJoinerPod" | oc apply -f -

while true; do
  if oc exec node-joiner -n "${namespace}" -- test -e /assets/exit_code >/dev/null 2>&1; then
    break
  else
    echo "Waiting for node-joiner pod to complete..."
    sleep 10s
  fi
done

res=$(oc exec node-joiner -n "${namespace}" -- cat /assets/exit_code)
if [ "$res" = 0 ]; then
  echo "node-joiner successfully completed, extracting ISO image..."
  oc cp -n "${namespace}" node-joiner:/assets/node.x86_64.iso node.x86_64.iso
else
  oc logs node-joiner -n "${namespace}"
  echo "node-joiner failed"
fi

echo "Cleaning up"
oc delete namespace "${namespace}" --grace-period=0 >/dev/null 2>&1 &
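
The polling loop in the script waits indefinitely for `/assets/exit_code`, so a pod that never starts (for example, because the image cannot be pulled) would leave the script hanging. One possible variation is a bounded wait, sketched below. `wait_for` is a hypothetical helper and the timeout values in the usage comment are assumptions, not values taken from the commit.

```shell
#!/bin/bash
# Hypothetical helper: run a command every <interval> seconds until it
# succeeds, or give up once <timeout> seconds have elapsed.
wait_for() {
  local timeout="$1" interval="$2"
  shift 2
  local elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Possible usage inside node-joiner.sh (timeout and interval are assumptions):
#   wait_for 900 10 oc exec node-joiner -n "${namespace}" -- test -e /assets/exit_code
```

With `set -e` active, as in the script, a timeout would stop the run before the namespace deletion at the end, so the cleanup step would need to move into the `trap` handler for this variation to be complete.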

images/baremetal/Dockerfile.ci

Lines changed: 7 additions & 0 deletions
@@ -6,7 +6,9 @@ ARG TAGS="baremetal fipscapable"
 WORKDIR /go/src/github.com/openshift/installer
 COPY . .
 RUN DEFAULT_ARCH="$(go env GOHOSTARCH)" hack/build.sh
+RUN DEFAULT_ARCH="$(go env GOHOSTARCH)" hack/build-node-joiner.sh
 
+FROM registry.ci.openshift.org/ocp/4.16:cli-artifacts AS tools
 
 FROM registry.ci.openshift.org/ocp/4.16:base
 COPY --from=builder /go/src/github.com/openshift/installer/bin/openshift-install /bin/openshift-install
@@ -16,6 +18,11 @@ RUN dnf upgrade -y && \
     openssl unzip jq openssh-clients && \
     dnf clean all && rm -rf /var/cache/yum/*
 
+# node-joiner requirements
+COPY --from=builder /go/src/github.com/openshift/installer/bin/node-joiner /bin/node-joiner
+COPY --from=tools /usr/bin/oc /bin/oc
+RUN dnf install -y nmstate
+
 RUN mkdir /output && chown 1000:1000 /output
 USER 1000:1000
 ENV PATH /bin
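
The image now needs three extra tools at runtime: the `node-joiner` binary, `oc`, and nmstate. A quick way to confirm that a built image contains them is a check like the sketch below, run inside the container. `check_tools` is a hypothetical helper, and `nmstatectl` is assumed to be the command installed by the `nmstate` package.

```shell
#!/bin/bash
# Hypothetical helper: report which of the given commands are not on PATH.
check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Possible usage inside the image:
#   check_tools node-joiner oc nmstatectl
```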

images/installer/Dockerfile.ci

Lines changed: 5 additions & 0 deletions
@@ -13,10 +13,15 @@ COPY --from=providers /go/src/github.com/openshift/installer/terraform/bin/ terr
 RUN DEFAULT_ARCH="$(go env GOHOSTARCH)" hack/build.sh
 RUN go run -mod=vendor hack/build-coreos-manifest.go
 
+FROM registry.ci.openshift.org/ocp/4.16:cli-artifacts AS tools
 
 FROM registry.ci.openshift.org/ocp/4.16:base
 COPY --from=builder /go/src/github.com/openshift/installer/bin/openshift-install /bin/openshift-install
 COPY --from=builder /go/src/github.com/openshift/installer/bin/manifests/ /manifests/
+# Required to run agent-based installer from the container
+COPY --from=tools /usr/bin/oc /bin/oc
+RUN dnf install -y nmstate
+
 RUN mkdir /output && chown 1000:1000 /output
 USER 1000:1000
 ENV PATH /bin

pkg/asset/agent/image/agentimage.go

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ import (
 
 const (
 	agentISOFilename         = "agent.%s.iso"
-	agentAddNodesISOFilename = "agent-addnodes.%s.iso"
+	agentAddNodesISOFilename = "node.%s.iso"
 	iso9660Level1ExtLen      = 3
 )
 

pkg/nodejoiner/addnodes.go

Lines changed: 18 additions & 4 deletions
@@ -2,6 +2,8 @@ package nodejoiner
 
 import (
 	"context"
+	"os"
+	"path/filepath"
 
 	"github.com/openshift/installer/pkg/asset"
 	"github.com/openshift/installer/pkg/asset/agent/image"
@@ -10,6 +12,10 @@ import (
 	"github.com/openshift/installer/pkg/asset/store"
 )
 
+const (
+	addNodesResultFile = "exit_code"
+)
+
 // NewAddNodesCommand creates a new command for add nodes.
 func NewAddNodesCommand(directory string, kubeConfig string) error {
 	// Store the current parameters into the assets folder, so
@@ -22,12 +28,20 @@ func NewAddNodesCommand(directory string, kubeConfig string) error {
 		return err
 	}
 
-	ctx := context.Background()
-
 	fetcher := store.NewAssetsFetcher(directory)
-	return fetcher.FetchAndPersist(ctx, []asset.WritableAsset{
+	err = fetcher.FetchAndPersist(context.Background(), []asset.WritableAsset{
 		&workflow.AgentWorkflowAddNodes{},
 		&image.AgentImage{},
-		// To be completed
 	})
+
+	// Save the exit code result
+	exitCode := "0"
+	if err != nil {
+		exitCode = "1"
+	}
+	if err2 := os.WriteFile(filepath.Join(directory, addNodesResultFile), []byte(exitCode), 0600); err2 != nil {
+		return err2
+	}
+
+	return err
 }
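
The `exit_code` file written here is the signal that `node-joiner.sh` polls for: the file's existence means the command has finished, and its content (`0` or `1`) tells the script whether to copy the ISO or print the pod logs. The sketch below shows the consumer side of that handshake against a local directory. `read_result` and its return codes are hypothetical and are not taken from the commit.

```shell
#!/bin/bash
# Hypothetical helper: interpret the exit_code file written by node-joiner.
# Returns 0 on success, 1 on failure, 2 while the result is not yet available.
read_result() {
  local dir="$1" res
  if [ ! -e "$dir/exit_code" ]; then
    echo "pending"
    return 2
  fi
  res=$(cat "$dir/exit_code")
  if [ "$res" = 0 ]; then
    echo "success"
  else
    echo "failed"
    return 1
  fi
}
```

Returning a distinct status for each state would also let a caller propagate a failure, whereas the script in this commit prints `node-joiner failed` and still exits with status 0.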
