|
| 1 | +.. _setup_k8s_beamline: |
| 2 | + |
| 3 | + |
| 4 | +Create a New Kubernetes Beamline |
| 5 | +================================ |
| 6 | + |
| 7 | +.. warning:: |
| 8 | + |
| 9 | + This is a first draft that has been tested against a DLS test beamline |
| 10 | + only. I will remove this warning once it has been tested against: |
| 11 | + |
| 12 | + - the k3s example cluster described in the previous tutorial |
| 13 | + - a real DLS beamline. |
| 14 | + |
| 15 | + TODO: would it be better to have a separate tutorial for each of these? |
| 16 | + |
| 17 | +Up until now the tutorials have been deploying IOCs to the local docker or |
| 18 | +podman instance on your workstation. In this tutorial we look into setting |
| 19 | +up a Kubernetes cluster for a beamline and deploying a test IOC there. |
| 20 | + |
| 21 | +The advantage of using Kubernetes is that it is a production grade container |
| 22 | +orchestration system. It will manage the CPU, disk and memory available across |
| 23 | +your cluster of nodes, scheduling your IOCs and other services accordingly. |
| 24 | +It will also restart them if they fail and monitor their health. |
| 25 | +It can provide centralised logging and monitoring |
| 26 | +of all of your services including IOCs. |
| 27 | + |
| 28 | + |
| 29 | +In this tutorial we will create a new beamline in the Kubernetes cluster. |
| 30 | +Here we assume that the cluster is already set up and that there is |
| 31 | +a namespace configured for use by the beamline. See the previous tutorial |
| 32 | +for how to set one up if you do not have this already. |
| 33 | + |
| 34 | +.. note:: |
| 35 | + |
| 36 | + DLS users: these instructions are for the BL46P beamline. This beamline |
| 37 | + already exists at DLS, so you could just skip ahead to creating the |
| 38 | + example IOC. You will need to ask the cloud team for permission on |
| 39 | + cluster ``pollux``, namespace ``p46-iocs`` to do this. |
| 40 | + Go to this URL to request access: |
| 41 | + https://jira.diamond.ac.uk/servicedesk/customer/portal/2/create/92 |
| 42 | + |
| 43 | + However, these instructions can also be used to set up any |
| 44 | + new beamline at DLS - just substitute the beamline name where appropriate. |
| 45 | + You will need to have a beamline cluster already created for the |
| 46 | + beamline by the cloud team and have requested access via the URL above. |
| 47 | + |
| 48 | +Create a new beamline repository |
| 49 | +-------------------------------- |
| 50 | + |
| 51 | +To create a new beamline repository, use the template repository at |
| 52 | +https://github.com/epics-containers/blxxi-template. Click on the green |
| 53 | +"Use this template" button to create a new repository. Name the repository |
| 54 | +bl46p (or choose your own name and remember to substitute it in the rest of |
| 55 | +this tutorial). Create this repository in your own GitHub account. |
| 56 | + |
| 57 | +.. note:: |
| 58 | + |
| 59 | + DLS users: if this is a real beamline then it needs to be |
| 60 | + created in our internal GitLab registry at |
| 61 | + https://gitlab.diamond.ac.uk/controls/containers/beamline. |
| 62 | + For this purpose use the template description for `bl38p |
| 63 | + <https://github.com/epics-containers/bl38p?tab=readme-ov-file#how-to-create-a-new-beamline--accelerator-domain>`_. |
| 64 | + |
| 65 | + For test DLS beamlines these should still be created on GitHub |
| 66 | + as per the instructions below. |
| 67 | + |
| 68 | +Clone the new repository to your local machine and change directory into it. |
| 69 | + |
| 70 | +.. code-block:: bash |
| 71 | +
|
| 72 | + git clone https://github.com/YOUR_GITHUB_ACCOUNT/bl46p.git |
| 73 | + cd bl46p |
| 74 | +
|
| 75 | +Next make some changes to the repository to customise it for your beamline. |
| 76 | +Cut and paste the following script to do so. |
| 77 | + |
| 78 | +.. code-block:: bash |
| 79 | +
|
| 80 | + BEAMLINE=bl46p |
| 81 | +
|
| 82 | + # update the readme |
| 83 | + echo "Beamline repo for the beamline $BEAMLINE" > README.md |
| 84 | +
|
| 85 | + # remove the sample IOC directory |
| 86 | + rm -r iocs/blxxi-ea-ioc-01 |
| 87 | + # change the services setup scripts to use the new beamline name |
| 88 | + sed -i "s/blxxi/$BEAMLINE/g" services/* beamline-chart/values.yaml |
| 89 | +
|
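After running the customisation script it can be worth confirming that no template placeholders were missed. The following is a minimal sketch only: ``find_placeholders`` is a hypothetical helper name, not part of the template or of ``ec``, and it assumes a ``grep`` that supports recursive search.

```shell
# Hypothetical helper: list any lines that still contain a template placeholder.
# Arguments: the placeholder text, then the files or directories to search.
# Returns 0 if the placeholder is found, non-zero otherwise (as grep does).
find_placeholders() {
    local placeholder="$1"
    shift
    # -r: recurse into directories, -n: show line numbers,
    # -F: treat the placeholder as a fixed string, not a regular expression
    grep -rnF -- "$placeholder" "$@"
}

# Usage, from the root of the beamline repository:
#   find_placeholders blxxi services beamline-chart/values.yaml \
#       && echo "placeholders remain" || echo "all placeholders replaced"
```

Any lines printed still contain ``blxxi`` and would need to be edited by hand.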
| 90 | +Cluster Topologies |
| 91 | +------------------ |
| 92 | + |
| 93 | +There are two supported topologies for beamline clusters: |
| 94 | + |
| 95 | +- shared cluster with multiple beamlines' IOCs running in the same cluster |
| 96 | +- dedicated cluster with a single beamline's IOCs running in the cluster |
| 97 | + |
| 98 | +If you are working with the single node k3s cluster set up in the previous |
| 99 | +tutorial then this will be considered a dedicated cluster. |
| 100 | + |
| 101 | +If you are creating a real DLS beamline or accelerator domain then this will |
| 102 | +also be a dedicated cluster. You will need to make sure the cloud team has |
| 103 | +created the cluster for the beamline and you have permissions to use it. |
| 104 | + |
| 105 | +If you are working with one of the test beamlines at DLS then these are usually |
| 106 | +shared topology and are set up as nodes on the Pollux cluster. |
| 107 | + |
| 108 | +Other facilities are free to choose the topology that best suits their needs. |
| 109 | + |
| 110 | +Shared Clusters |
| 111 | +~~~~~~~~~~~~~~~ |
| 112 | + |
| 113 | +In the shared cluster topology we would usually want IOCs to run on the |
| 114 | +servers that are closest to the beamline. This is important for Channel Access |
| 115 | +because it is a broadcast protocol and by default only works on a single |
| 116 | +subnet. |
| 117 | + |
| 118 | +To facilitate this we use ``node affinity rules`` to ensure that IOCs |
| 119 | +run on the beamline's specific nodes. ``Node affinity`` can look for a ``label`` |
| 120 | +on the node to say that it belongs to a beamline. |
| 121 | +We can also use ``taints`` to stop other pods from |
| 122 | +running on our beamline nodes. A ``taint`` will stop pods from being scheduled |
| 123 | +on a node unless the pod has a matching toleration. |
| 124 | + |
| 125 | +For example the test beamline p46 at DLS has the following ``taints`` and |
| 126 | +``labels``: |
| 127 | + |
| 128 | +.. code-block:: |
| 129 | +
|
| 130 | + Labels: beamline=bl46p |
| 131 | + nodetype=test-rig |
| 132 | +
|
| 133 | + Taints: beamline=bl46p:NoSchedule |
| 134 | + nodetype=test-rig:NoSchedule |
| 135 | +
|
| 136 | +If you are working with your facility cluster then you may not |
| 137 | +have permission to set up these labels and taints. In this case, your |
| 138 | +administrator will need to do this for you. At DLS, you should expect that |
| 139 | +this is already set up for you. |
| 140 | + |
| 141 | +For an explanation of these K8S concepts see |
| 142 | + |
| 143 | +- `Taints and Tolerations <https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/>`_ |
| 144 | +- `Node Affinity <https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#node-affinity-beta-feature>`_ |
| 145 | + |
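To see which labels and taints are applied on a shared cluster, something like the following sketch may help. The function names are hypothetical, the ``beamline`` label key is taken from the p46 example above, and you need permission to list nodes for these commands to work.

```shell
# Hypothetical helper: list the nodes labelled for a given beamline,
# showing all of their labels.
show_beamline_nodes() {
    kubectl get nodes -l "beamline=$1" --show-labels
}

# Hypothetical helper: print the taints set on every node in the cluster.
show_node_taints() {
    kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints'
}

# Usage:
#   show_beamline_nodes bl46p
#   show_node_taints
```

If the first command lists no nodes, the label has probably not been applied yet and your administrator will need to add it.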
| 146 | +Dedicated Clusters |
| 147 | +~~~~~~~~~~~~~~~~~~ |
| 148 | + |
| 149 | +In the dedicated cluster topology we would usually want to let the IOCs |
| 150 | +run on all of the worker nodes in the cluster. In this case the only thing |
| 151 | +that is required is a namespace in which to run your IOCs. |
| 152 | + |
| 153 | +By convention we use a namespace like ``bl46p-iocs`` for this purpose. This |
| 154 | +namespace will need the appropriate permissions to allow the IOCs to run |
| 155 | +with host networking. |
| 156 | + |
| 157 | +Environment Setup |
| 158 | +----------------- |
| 159 | + |
| 160 | +Every beamline repository has an ``environment.sh`` file used to configure |
| 161 | +your shell so that the command line tools know which cluster to talk to. |
| 162 | +Up to this point we have been using the local docker or podman instance, |
| 163 | +but here we will configure it to use the beamline cluster. |
| 164 | + |
| 165 | +For the detail of what goes into ``environment.sh`` see |
| 166 | +`../reference/environment`. |
| 167 | + |
| 168 | +Edit ``environment.sh`` and make changes as follows: |
| 169 | + |
| 170 | +Section 1 |
| 171 | +~~~~~~~~~ |
| 172 | + |
| 173 | +Set the following variables: |
| 174 | + |
| 175 | +.. code-block:: bash |
| 176 | +
|
| 177 | + export EC_REGISTRY_MAPPING='github.com=ghcr.io' |
| 178 | + export EC_K8S_NAMESPACE=p46-iocs |
| 179 | + export [email protected]:YOUR_GITHUB_ACCOUNT/bl46p.git |
| 180 | +
|
| 181 | +This tells the ``ec`` command line tool to use the GitHub container registry |
| 182 | +when it sees github projects, the name of the Kubernetes namespace to use and |
| 183 | +the location of the beamline repository. |
| 184 | + |
| 185 | +Section 2 |
| 186 | +~~~~~~~~~ |
| 187 | + |
| 188 | +The script should also make sure that the ``ec`` CLI is available. It is also |
| 189 | +useful to set up command line completion. The simplest way to do this is: |
| 190 | + |
| 191 | +.. code-block:: bash |
| 192 | +
|
| 193 | + set -e # exit on error |
| 194 | + source <(ec --show-completion ${SHELL}) |
| 195 | +
|
| 196 | +Section 3 |
| 197 | +~~~~~~~~~ |
| 198 | + |
| 199 | +This is where you make sure the cluster is contactable. For the k3s cluster |
| 200 | +we set up the default ``~/.kube/config`` file to point to the local cluster. |
| 201 | +So we can leave this section blank. |
| 202 | + |
| 203 | +At DLS you would need to load a module to set up the environment for the |
| 204 | +beamline cluster. For example: |
| 205 | + |
| 206 | +.. code-block:: bash |
| 207 | +
|
| 208 | + module load pollux # for all test beamlines |
| 209 | + module load k8s-i22 # for the real beamline i22 |
| 210 | +
|
| 211 | +Once ``environment.sh`` is set up, source it to set up your shell. |
| 212 | + |
| 213 | +.. code-block:: bash |
| 214 | +
|
| 215 | + source environment.sh |
| 216 | +
|
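A quick way to confirm that sourcing worked is to check that the variables from Section 1 are present in your shell. This is a sketch under assumptions: ``check_env_vars`` is a hypothetical helper, not something ``ec`` provides, and it checks only the variable names you pass to it.

```shell
# Hypothetical helper: report which of the named variables are unset or empty.
# Returns 0 if all are set, 1 if any are missing.
check_env_vars() {
    local name value missing=0
    for name in "$@"; do
        # Look up the value of the variable whose name is held in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "not set: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_env_vars EC_REGISTRY_MAPPING EC_K8S_NAMESPACE && echo "environment OK"
```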
| 217 | +You are now ready to start talking to the cluster. You can verify this with |
| 218 | +the following command that should list all the nodes on the cluster. You |
| 219 | +will be asked for your credentials if required. |
| 220 | + |
| 221 | +.. code-block:: bash |
| 222 | +
|
| 223 | + kubectl get nodes |
| 224 | +
|
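Listing nodes shows that the cluster is reachable, but not that you can deploy into the beamline namespace. As a rough sketch, ``kubectl auth can-i`` can be used to ask the cluster about your permissions. The function name is hypothetical, and checking for ``create pods`` is an assumption about what a deployment needs; your cluster may require other permissions as well.

```shell
# Hypothetical helper: ask the cluster whether the current user may
# create pods in the given namespace. Prints "yes" or "no".
can_deploy_to() {
    kubectl auth can-i create pods --namespace "$1"
}

# Usage:
#   can_deploy_to p46-iocs
```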
| 225 | +Setting up the Beamline Helm Chart Defaults |
| 226 | +------------------------------------------- |
| 227 | + |
| 228 | +The beamline helm chart is used to deploy IOCs to the cluster. Each IOC |
| 229 | +gets to override any of the settings available in the chart. However, all |
| 230 | +settings except ``image`` have default values supplied at the beamline level. |
| 231 | +For this reason most IOC instances only need supply the ``image`` setting |
| 232 | +which specifies the Generic IOC container image to use. |
| 233 | + |
| 234 | +Before making the first IOC instance we need to set up the beamline defaults. |
| 235 | +These are all held in the file ``beamline-chart/values.yaml``. |
| 236 | + |
| 237 | +Open this file and make the following changes depending on your beamline |
| 238 | +type. |
| 239 | + |
| 240 | +All cluster types |
| 241 | +~~~~~~~~~~~~~~~~~ |
| 242 | + |
| 243 | +.. code-block:: yaml |
| 244 | +
|
| 245 | + beamline: bl46p |
| 246 | + namespace: p46-iocs |
| 247 | + hostNetwork: true # required for Channel Access on the host |
| 248 | +
|
| 249 | + opisClaim: bl46p-opi-claim |
| 250 | + runtimeClaim: bl46p-runtime-claim |
| 251 | + autosaveClaim: bl46p-autosave-claim |
| 252 | +
|
| 253 | +k3s single server cluster |
| 254 | +~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 255 | + |
| 256 | +.. code-block:: yaml |
| 257 | +
|
| 258 | + dataVolume: |
| 259 | + pvc: true |
| 260 | + # point at a PVC created by kubernetes |
| 261 | + hostPath: /data/ |
| 262 | +
|
| 263 | +DLS test beamlines |
| 264 | +~~~~~~~~~~~~~~~~~~ |
| 265 | + |
| 266 | +.. code-block:: yaml |
| 267 | +
|
| 268 | + dataVolume: |
| 269 | + pvc: true |
| 270 | + # point at local disk on the server |
| 271 | + hostPath: /exports/mybeamline |
| 272 | +
|
| 273 | + # extra tolerations for the training rigs |
| 274 | + tolerations: |
| 275 | + - key: nodetype |
| 276 | + operator: "Equal" |
| 277 | + value: training-rig |
| 278 | + effect: "NoSchedule" |
| 279 | +
|
| 280 | +DLS real beamlines |
| 281 | +~~~~~~~~~~~~~~~~~~ |
| 282 | + |
| 283 | +.. code-block:: yaml |
| 284 | +
|
| 285 | + dataVolume: |
| 286 | + pvc: true |
| 287 | + # point at the shared filesystem data folder for the beamline |
| 288 | + hostPath: /dls/p46/data |
| 289 | +
|
| 290 | +Create a Test IOC to Deploy |
| 291 | +--------------------------- |
| 292 | + |
| 293 | +TODO: WIP (but this looks just like it did in the first IOC deployment tutorial) |