Commit 961da9a

first draft k8s tutorials
1 parent 325beb4 commit 961da9a

4 files changed

+309
-12
lines changed

docs/user/index.rst

Lines changed: 1 addition & 0 deletions
@@ -28,6 +28,7 @@ side-bar.
2828
tutorials/test_generic_ioc
2929
tutorials/support_module
3030
tutorials/setup_k8s
31+
tutorials/setup_k8s_new_beamline
3132
tutorials/rtems_setup
3233
tutorials/rtems_ioc
3334
tutorials/ibek

docs/user/tutorials/setup_k8s.rst

Lines changed: 10 additions & 11 deletions
@@ -4,11 +4,10 @@
44
Setup a Kubernetes Server
55
=========================
66

7-
87
.. Note::
98

10-
**DLS Users**: DLS already has the test cluster Pollux which includes
11-
the test beamline p46 and the training beamlines p46 through to p49.
9+
**DLS Users**: DLS already has the test cluster ``Pollux`` which includes
10+
the test beamline p45 and the training beamlines p46 through to p49.
1211

1312
We have also started to roll out production clusters for some of our
1413
beamlines. To date we have clusters for p38, i20, i22 and c01.
@@ -125,18 +124,18 @@ uses a namespace for each beamline or accelerator domain.
125124
A context is a combination of a cluster, namespace, and user. It tells kubectl
126125
which cluster and namespace to use when communicating with the Kubernetes API.
127126

128-
Here we will create a namespace for our first test beamline bl45p. We are
129-
using this name because it is the name of the first ever Kubernetes beamline
127+
Here we will create a namespace for our first test beamline bl46p. We are
128+
using this name because it is the name of the first test Kubernetes beamline
130129
at DLS. This just means I can use some of the following tutorials for both
131130
DLS and non-DLS users.
132131

133132
From the workstation INSIDE the devcontainer execute the following:
134133

135134
.. code-block:: bash
136135
137-
kubectl create namespace bl45p
138-
kubectl config set-context bl45p --namespace=bl45p --user=default --cluster=default
139-
kubectl config use-context bl45p
136+
kubectl create namespace bl46p
137+
kubectl config set-context bl46p --namespace=bl46p --user=default --cluster=default
138+
kubectl config use-context bl46p
140139
141140
Create a service account to run the IOCs
142141
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -152,7 +151,7 @@ Create the account:
152151
apiVersion: v1
153152
kind: ServiceAccount
154153
metadata:
155-
name: bl45p-priv
154+
name: bl46p-priv
156155
EOF
157156
158157
Generate a login token for the account:
@@ -163,9 +162,9 @@ Generate a login token for the account:
163162
apiVersion: v1
164163
kind: Secret
165164
metadata:
166-
name: bl45p-priv-secret
165+
name: bl46p-priv-secret
167166
annotations:
168-
kubernetes.io/service-account.name: bl45p-priv
167+
kubernetes.io/service-account.name: bl46p-priv
169168
type: kubernetes.io/service-account-token
170169
EOF
171170

docs/user/tutorials/setup_k8s_new_beamline.rst

Lines changed: 293 additions & 0 deletions
@@ -0,0 +1,293 @@
1+
.. _setup_k8s_beamline:
2+
3+
4+
Create a New Kubernetes Beamline
5+
================================
6+
7+
.. warning::
8+
9+
This is a first draft that has been tested against a DLS test beamline
10+
only. I will remove this warning once it has been tested against:
11+
12+
- the k3s example cluster described in the previous tutorial
13+
- a real DLS beamline.
14+
15+
TODO: would it be better to have a separate tutorial for each of these?
16+
17+
Up until now the tutorials have been deploying IOCs to the local docker or
18+
podman instance on your workstation. In this tutorial we look into setting
19+
up a Kubernetes cluster for a beamline and deploying a test IOC there.
20+
21+
The advantage of using Kubernetes is that it is a production grade container
22+
orchestration system. It will manage the CPU, disk and memory available across
23+
your cluster of nodes, scheduling your IOCs and other services accordingly.
24+
It will also restart them if they fail and monitor their health.
25+
It can provide centralised logging and monitoring
26+
of all of your services including IOCs.
27+
28+
29+
In this tutorial we will create a new beamline in the Kubernetes cluster.
30+
Here we assume that the cluster is already set up and that there is
31+
a namespace configured for use by the beamline. See the previous tutorial
32+
for how to set one up if you do not have this already.
33+
34+
.. note::
35+
36+
DLS users: these instructions are for the BL46P beamline. This beamline
37+
already exists at DLS, so you could just skip ahead to creating the
38+
example IOC. You will need to ask the cloud team for permission on
39+
cluster ``pollux``, namespace ``p46-iocs`` to do this.
40+
Go to this URL to request access:
41+
https://jira.diamond.ac.uk/servicedesk/customer/portal/2/create/92
42+
43+
However, these instructions can also be used to set up any
44+
new beamline at DLS - just substitute the beamline name where appropriate.
45+
You will need to have a beamline cluster already created for the
46+
beamline by the cloud team and have requested access via the URL above.
47+
48+
Create a new beamline repository
49+
--------------------------------
50+
51+
To create a new beamline repository, use the template repository at
52+
https://github.com/epics-containers/blxxi-template. Click on the green
53+
"Use this template" button to create a new repository. Name the repository
54+
bl46p (or choose your own name and remember to substitute it in the rest of
55+
this tutorial). Create this repository in your own GitHub account.
56+
57+
.. note::
58+
59+
DLS users: if this is a real beamline then it needs to be
60+
created in our internal GitLab registry at
61+
https://gitlab.diamond.ac.uk/controls/containers/beamline.
62+
For this purpose use the template description for `bl38p
63+
<https://github.com/epics-containers/bl38p?tab=readme-ov-file#how-to-create-a-new-beamline--accelerator-domain>`_.
64+
65+
For test DLS beamlines, the repository should still be created on GitHub
66+
as per the instructions below.
67+
68+
Clone the new repository to your local machine and change directory into it.
69+
70+
.. code-block:: bash
71+
72+
git clone https://github.com/YOUR_GITHUB_ACCOUNT/bl46p.git
73+
cd bl46p
74+
75+
Next make some changes to the repository to customise it for your beamline.
76+
Cut and paste the following script to do so.
77+
78+
.. code-block:: bash
79+
80+
BEAMLINE=bl46p
81+
82+
# update the readme
83+
echo "Beamline repo for the beamline $BEAMLINE" > README.md
84+
85+
# remove the sample IOC directory
86+
rm -r iocs/blxxi-ea-ioc-01
87+
# change the services setup scripts to use the new beamline name
88+
sed -i "s/blxxi/$BEAMLINE/g" services/* beamline-chart/values.yaml
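
As a quick sanity check after running the script, it can help to confirm that
no template placeholders remain. The following is a minimal sketch and not part
of the template: the helper name ``find_placeholder`` is hypothetical, and it
assumes you run it from the root of the new repository.

```shell
# Hypothetical helper: list files under the given paths that still contain
# the template placeholder. Prints nothing if the substitution was complete.
find_placeholder() {
    placeholder="$1"
    shift
    grep -rl -- "$placeholder" "$@" 2>/dev/null || true
}
```

For example, ``find_placeholder blxxi services beamline-chart/values.yaml``
should print nothing once the ``sed`` command above has run.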
89+
90+
Cluster Topologies
91+
------------------
92+
93+
There are two supported topologies for beamline clusters:
94+
95+
- shared cluster with multiple beamlines' IOCs running in the same cluster
96+
- dedicated cluster with a single beamline's IOCs running in the cluster
97+
98+
If you are working with the single node k3s cluster set up in the previous
99+
tutorial then this will be considered a dedicated cluster.
100+
101+
If you are creating a real DLS beamline or accelerator domain then this will
102+
also be a dedicated cluster. You will need to make sure the cloud team has
103+
created the cluster for the beamline and you have permissions to use it.
104+
105+
If you are working with one of the test beamlines at DLS then these are usually
106+
shared topology and are set up as nodes on the Pollux cluster.
107+
108+
Other facilities are free to choose the topology that best suits their needs.
109+
110+
Shared Clusters
111+
~~~~~~~~~~~~~~~
112+
113+
In the shared cluster topology we would usually want IOCs to run on the
114+
servers that are closest to the beamline. This is important for Channel Access
115+
because it is a broadcast protocol and by default only works on a single
116+
subnet.
117+
118+
To facilitate this we use ``node affinity rules`` to ensure that IOCs
119+
run on the beamline's specific nodes. ``Node affinity`` can look for a ``label``
120+
on the node to say that it belongs to a beamline.
121+
We can also use ``taints`` to stop other pods from
122+
running on our beamline nodes. A ``taint`` will stop pods from being scheduled
123+
on a node unless the pod has a matching toleration.
124+
125+
For example the test beamline p46 at DLS has the following ``taints`` and
126+
``labels``:
127+
128+
.. code-block::
129+
130+
Labels: beamline=bl46p
131+
nodetype=test-rig
132+
133+
Taints: beamline=bl46p:NoSchedule
134+
nodetype=test-rig:NoSchedule
135+
136+
If you are working with your facility cluster then you may not
137+
have permission to set up these labels and taints. In this case, your
138+
administrator will need to do this for you. At DLS, you should expect that
139+
this is already set up for you.
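
For reference, an administrator would typically apply labels and taints like
those above with ``kubectl label`` and ``kubectl taint``. The sketch below only
prints the commands so that they can be reviewed first. The function name
``beamline_node_cmds`` and the node name used in the usage note are
hypothetical.

```shell
# Hypothetical helper: print (do not run) the kubectl commands that would
# label and taint one node for a beamline. Running them needs admin rights.
beamline_node_cmds() {
    node="$1"
    beamline="$2"
    nodetype="$3"
    echo "kubectl label nodes $node beamline=$beamline nodetype=$nodetype"
    echo "kubectl taint nodes $node beamline=$beamline:NoSchedule nodetype=$nodetype:NoSchedule"
}
```

``beamline_node_cmds my-node bl46p test-rig`` prints the two commands for a
node called ``my-node``. After they have been applied,
``kubectl describe node my-node`` should show the resulting ``Labels`` and
``Taints``.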
140+
141+
For an explanation of these Kubernetes concepts see:
142+
143+
- `Taints and Tolerations <https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/>`_
144+
- `Node Affinity <https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#node-affinity-beta-feature>`_
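
To make the relationship between labels, taints, affinity and tolerations
concrete, here is a minimal standalone sketch of a pod that would be scheduled
only on the p46 nodes described above. It is an illustration, not the template
used by the beamline helm chart: the file name ``affinity-demo.yaml``, the pod
name and the ``busybox`` image are all assumptions.

```shell
# Write an illustrative pod manifest. The node affinity rule requires the
# beamline=bl46p label, and the tolerations match the two taints shown above.
cat > affinity-demo.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: affinity-demo
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: beamline
                operator: In
                values:
                  - bl46p
  tolerations:
    - key: beamline
      operator: Equal
      value: bl46p
      effect: NoSchedule
    - key: nodetype
      operator: Equal
      value: test-rig
      effect: NoSchedule
  containers:
    - name: demo
      image: busybox
      command: ["sleep", "3600"]
EOF
```

The affinity rule attracts the pod to the beamline's nodes, while the
tolerations permit it to run there despite the taints. Both halves are needed
because a toleration on its own does not stop the pod from landing elsewhere.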
145+
146+
Dedicated Clusters
147+
~~~~~~~~~~~~~~~~~~
148+
149+
In the dedicated cluster topology we would usually want to let the IOCs
150+
run on all of the worker nodes in the cluster. In this case the only thing
151+
that is required is a namespace in which to run your IOCs.
152+
153+
By convention we use a namespace like ``bl46p-iocs`` for this purpose. This
154+
namespace will need the appropriate permissions to allow the IOCs to run
155+
with host networking.
156+
157+
Environment Setup
158+
-----------------
159+
160+
Every beamline repository has an ``environment.sh`` file used to configure
161+
your shell so that the command line tools know which cluster to talk to.
162+
Up to this point we have been using the local docker or podman instance,
163+
but here we will configure it to use the beamline cluster.
164+
165+
For the detail of what goes into ``environment.sh`` see
166+
`../reference/environment`.
167+
168+
Edit ``environment.sh`` and make changes as follows:
169+
170+
Section 1
171+
~~~~~~~~~
172+
173+
Set the following variables:
174+
175+
.. code-block:: bash
176+
177+
export EC_REGISTRY_MAPPING='github.com=ghcr.io'
178+
export EC_K8S_NAMESPACE=p46-iocs
179+
export [email protected]:YOUR_GITHUB_ACCOUNT/bl46p.git
180+
181+
This tells the ``ec`` command line tool to use the GitHub container registry
182+
when it sees github projects, the name of the Kubernetes namespace to use and
183+
the location of the beamline repository.
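
If you want ``environment.sh`` to fail early when a setting is missing, a
check along the following lines could be added. This is only a sketch: the
function name ``check_ec_env`` is hypothetical, and it covers only the
namespace and registry mapping variables named above.

```shell
# Hypothetical helper: report the first required variable that is unset or
# empty and return non-zero, otherwise confirm the environment is complete.
check_ec_env() {
    for name in EC_REGISTRY_MAPPING EC_K8S_NAMESPACE; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            return 1
        fi
    done
    echo "environment looks complete"
}
```

Calling ``check_ec_env`` after the ``export`` lines gives a clear message
instead of a confusing failure later when ``ec`` is first used.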
184+
185+
Section 2
186+
~~~~~~~~~
187+
188+
The script should also make sure that the ``ec`` CLI is available. It is also
189+
useful to set up command line completion. The simplest way to do this is:
190+
191+
.. code-block:: bash
192+
193+
set -e # exit on error
194+
source <(ec --show-completion ${SHELL})
195+
196+
Section 3
197+
~~~~~~~~~
198+
199+
This is where you make sure the cluster is contactable. For the k3s cluster
200+
we set up the default ``~/.kube/config`` file to point to the local cluster.
201+
So we can leave this section blank.
202+
203+
At DLS you would need to load a module to set up the environment for the
204+
beamline cluster. For example:
205+
206+
.. code-block:: bash
207+
208+
module load pollux # for all test beamlines
209+
module load k8s-i22 # for the real beamline i22
210+
211+
Once ``environment.sh`` is set up, source it to set up your shell.
212+
213+
.. code-block:: bash
214+
215+
source environment.sh
216+
217+
You are now ready to start talking to the cluster. You can verify this with
218+
the following command that should list all the nodes on the cluster. You
219+
will be asked for your credentials if required.
220+
221+
.. code-block:: bash
222+
223+
kubectl get nodes
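
Listing nodes confirms that the cluster is reachable, but it does not show
whether you can work in the beamline namespace. The sketch below bundles a few
further standard ``kubectl`` checks. The function name ``cluster_checks`` and
the ``KUBECTL`` override variable are hypothetical conveniences, the latter so
that the commands can be previewed with ``KUBECTL=echo``.

```shell
# Hypothetical helper: show the active context, list pods in the namespace,
# and ask the API server whether the current user may create pods there.
cluster_checks() {
    ns="$1"
    kubectl_cmd="${KUBECTL:-kubectl}"
    "$kubectl_cmd" config current-context &&
        "$kubectl_cmd" get pods --namespace "$ns" &&
        "$kubectl_cmd" auth can-i create pods --namespace "$ns"
}
```

For example, ``cluster_checks p46-iocs`` runs the three checks against the
namespace used in this tutorial and stops at the first one that fails.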
224+
225+
Setting up the Beamline Helm Chart Defaults
226+
-------------------------------------------
227+
228+
The beamline helm chart is used to deploy IOCs to the cluster. Each IOC
229+
gets to override any of the settings available in the chart. However, all
230+
settings except ``image`` have default values supplied at the beamline level.
231+
For this reason most IOC instances only need to supply the ``image`` setting
232+
which specifies the Generic IOC container image to use.
233+
234+
Before making the first IOC instance we need to set up the beamline defaults.
235+
These are all held in the file ``beamline-chart/values.yaml``.
236+
237+
Open this file and make the following changes depending on your beamline
238+
type.
239+
240+
All cluster types
241+
~~~~~~~~~~~~~~~~~
242+
243+
.. code-block:: yaml
244+
245+
beamline: bl46p
246+
namespace: p46-iocs
247+
hostNetwork: true # required for channel access on the host
248+
249+
opisClaim: bl46p-opi-claim
250+
runtimeClaim: bl46p-runtime-claim
251+
autosaveClaim: bl46p-autosave-claim
252+
253+
k3s single server cluster
254+
~~~~~~~~~~~~~~~~~~~~~~~~~
255+
256+
.. code-block:: yaml
257+
258+
dataVolume:
259+
pvc: true
260+
# point at a PVC created by kubernetes
261+
hostPath: /data/
262+
263+
DLS test beamlines
264+
~~~~~~~~~~~~~~~~~~
265+
266+
.. code-block:: yaml
267+
268+
dataVolume:
269+
pvc: true
270+
# point at local disk on the server
271+
hostPath: /exports/mybeamline
272+
273+
# extra tolerations for the training rigs
274+
tolerations:
275+
- key: nodetype
276+
operator: "Equal"
277+
value: training-rig
278+
effect: "NoSchedule"
279+
280+
DLS real beamlines
281+
~~~~~~~~~~~~~~~~~~
282+
283+
.. code-block:: yaml
284+
285+
dataVolume:
286+
pvc: true
287+
# point at the shared filesystem data folder for the beamline
288+
hostPath: /dls/p46/data
289+
290+
Create a Test IOC to Deploy
291+
---------------------------
292+
293+
TODO: WIP (but this looks just like it did in the first IOC deployment tutorial)

docs/user/tutorials/setup_workstation.rst

Lines changed: 5 additions & 1 deletion
@@ -187,7 +187,11 @@ will deploy containers to the local workstation's docker or podman instance.
187187
However, everything in these tutorials would also work with Kubernetes. If you
188188
are particularly interested in Kubernetes then you can jump to
189189
`setup_kubernetes` and follow the instructions there. Then come back to this
190-
point and continue with the tutorials.
190+
point and continue with the tutorials. If you do this just be aware that
191+
we use the beamline name ``bl01t`` for local deployment examples and
192+
``bl46p`` for Kubernetes examples so you will need to substitute the
193+
appropriate beamline name for your environment. All the local deployment
194+
examples should also deploy to a Kubernetes cluster.
191195

192196
If you are planning not to use Kubernetes at all then now might be
193197
a good time to install an alternative container management platform such
