Merged
9 changes: 9 additions & 0 deletions copier.yml
Original file line number Diff line number Diff line change
Expand Up @@ -103,6 +103,15 @@ docker:
    Would you like to publish your project in a Docker container?
    You should select this if you are making a service.

docker_debug:
  type: bool
  when: "{{ docker }}"
  help: |
    Would you like to publish a debug image of your service?
    This will increase the number of published images, but may
    be useful if debugging the service inside of the cluster
    infrastructure is required.

docs_type:
  type: str
  help: |
Expand Down
99 changes: 99 additions & 0 deletions docs/how-to/debug-in-cluster.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
# Debugging containers

If the `docker_debug` option is chosen, the container build also publishes a debug container for each tagged release, with the image tag suffixed with `-debug`. This container includes an editable install of the workspace and debugpy, and has an alternate entrypoint that allows the devcontainer to attach.

# Using Debug image in a Helm chart

⚠️ If running with the Diamond filesystem mounted or as a specific user, further adjustments are required, as described in the next section.

Using the debug image in a Helm chart can be as simple as changing the `image.tag` value in `values.yaml` to the tag with the `-debug` suffix. However, this may cause problems if you have defined liveness or readiness probes, a custom command or args, or if the container runs as non-root. To make these edge cases easier to handle, it is recommended to define a single flag `debug.enabled` in your `values.yaml` and make the following modifications to the `Deployment|ReplicaSet|StatefulSet`:

```yaml
spec:
  template:
    spec:
      containers:
        - image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}{{ ternary "-debug" "" .Values.debug.enabled }}"
          {{- if not .Values.debug.enabled }} # If your Helm chart overrides the `CMD` Containerfile instruction, it should not when in debug mode
          args: ["some", "example", "args"]
          {{- end }}
          {{- if not .Values.debug.enabled }} # prevent probes causing issues before attaching and starting the service
          {{- with .Values.livenessProbe }}
          livenessProbe:
            {{- toYaml . | nindent 12 }}
          {{- end }}
          {{- with .Values.readinessProbe }}
          readinessProbe:
            {{- toYaml . | nindent 12 }}
          {{- end }}
          {{- end }}
          volumeMounts:
            {{- if .Values.debug.enabled }}
            - mountPath: /home # required for VSCode to install extensions if running as non-root
              name: home
            {{- end }}
            {{- with .Values.volumeMounts }}
            {{- toYaml . | nindent 12 }}
            {{- end }}
      volumes:
        {{- if .Values.debug.enabled }}
        - name: home # mount /home as an editable volume to prevent permission issues
          emptyDir:
            sizeLimit: 500Mi
        {{- end }}
        {{- with .Values.volumes }}
        {{- toYaml . | nindent 8 }}
        {{- end }}
```
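
To make the image expression in the template above concrete, the following is a minimal Python sketch of the logic it encodes: use `image.tag` if set, otherwise the chart's `appVersion`, and append `-debug` when `debug.enabled` is true. The function name and the example repository are hypothetical and only illustrate the rendering; Helm itself does the real work.

```python
from typing import Optional


def image_reference(
    repository: str,
    tag: Optional[str],
    app_version: str,
    debug_enabled: bool,
) -> str:
    """Mirror the template: tag (or chart appVersion) plus an optional -debug suffix."""
    # Helm's `default` falls back when the value is empty, so "" and None behave alike.
    effective_tag = tag or app_version
    # Equivalent of `ternary "-debug" "" .Values.debug.enabled`.
    suffix = "-debug" if debug_enabled else ""
    return f"{repository}:{effective_tag}{suffix}"


if __name__ == "__main__":
    # Hypothetical repository and version, for illustration only.
    print(image_reference("ghcr.io/example-org/my-service", "", "1.2.3", debug_enabled=True))
```

Keeping the suffix behind a single flag means the same values file can switch between the release and debug images without editing the tag by hand.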

# Using Debug image in a Helm chart that mounts the filesystem

Containers that run in the Diamond Kubernetes infrastructure as a specific uid (e.g. when mounting the filesystem) must provide name resolution from Diamond's LDAP infrastructure. Inside the cluster the VSCode server runs as that user, and it requires that the user's name and home directory can be found. The debug image configures the name lookup service to try finding the user internally (i.e. from `/etc/passwd`), then fall back to calling LDAP through a service called `libnss-ldapd`. As containers are designed to run a single process, this service runs in a sidecar container, which must share the `/var/run/nslcd` socket mount with the primary container.

This requires the following further additions to the template modified above:

```yaml
spec:
  template:
    spec:
      containers:
        - volumeMounts:
            {{- if .Values.debug.enabled }}
            - mountPath: /var/run/nslcd # socket to place query for user information
              name: nslcd
            [...]
        {{- if .Values.debug.enabled }}
        - name: debug-account-sync
          image: ghcr.io/diamondlightsource/account-sync-sidecar:3.0.0
          volumeMounts:
            - mountPath: /var/run/nslcd # socket to pick queries for user information
              name: nslcd
        {{- end }}
      volumes:
        {{- if .Values.debug.enabled }}
        - name: nslcd # mutually mounted filesystem to both containers
          emptyDir:
            sizeLimit: 5Mi
        [...]
```
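
One way to check that name resolution works is to look up the current uid from inside the primary container. The sketch below uses only Python's standard `pwd` module, which on typical glibc-based images goes through the system name service configuration; that it exercises the LDAP path in your deployment is an assumption, and the helper name `lookup_user` is hypothetical.

```python
import os
import pwd
from typing import Optional, Tuple


def lookup_user(uid: int) -> Optional[Tuple[str, str]]:
    """Return (user name, home directory) for uid, or None if it cannot be resolved."""
    try:
        entry = pwd.getpwuid(uid)
    except KeyError:
        # No passwd entry was found for this uid.
        return None
    return entry.pw_name, entry.pw_dir


if __name__ == "__main__":
    uid = os.getuid()
    resolved = lookup_user(uid)
    if resolved is None:
        print(f"uid {uid} does not resolve to a user; check the nslcd socket mount and sidecar")
    else:
        name, home = resolved
        print(f"uid {uid} resolves to {name!r} with home directory {home}")
```

If the lookup returns nothing while the pod runs as a specific uid, the shared `/var/run/nslcd` mount and the sidecar container are the first things to inspect.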

# Debugging in the cluster

With the [Kubernetes plugin for VSCode](https://marketplace.visualstudio.com/items?itemName=ms-kubernetes-tools.vscode-kubernetes-tools) it is then possible to attach to the container inside the cluster. From the VSCode Command Palette (Ctrl+Shift+P), use `Kubernetes: Set Kubeconfig` to configure VSCode with the server to use, then `Kubernetes: Use Namespace` to select the namespace.

```sh
# To find the KUBECONFIG to use from a Diamond machine
$ module load pollux
...
$ echo $KUBECONFIG
~/.kube/config_pollux
```

![Location of the Kubernetes plugin in the plugin bar (screen left), with the Clusters>cluster>Workloads>Pods views expanded out to show a pod named "my-service", overlaid with a dropdown box, with "Attach Visual Studio Code" highlighted](../images/debugging-kubernetes.jpg)
The Kubernetes plugin can be found in the plugin bar. After expanding the Clusters>`cluster`>Workloads>Pods views, your service should be visible. Right Click>Attach Visual Studio Code will start connecting to the workspace in the cluster. Select your service container from the top menu when prompted.

After the connection to the cluster has been established, open the workspace folder by clicking the Explorer option in the plugin bar. The repository will be mounted at `/workspaces/<service name>`, as it is when working with a local devcontainer.

Starting your service with the command in the container definition starts it on the node, with access to Kubernetes resources. It is now also possible to run it under a debugger or attach one, potentially configured to auto-reload code, or to start and stop the service rapidly while trying out prospective changes.
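
As an illustration of running under a debugger, the sketch below builds a command line that starts a module under debugpy's command-line interface and waits for a debugger to attach. The module name `my_service`, the helper name `debugpy_command`, and the host and port defaults are assumptions for illustration; substitute whatever your container definition actually runs.

```python
import sys
from typing import List


def debugpy_command(
    module: str,
    host: str = "127.0.0.1",
    port: int = 5678,
    wait_for_client: bool = True,
) -> List[str]:
    """Build an argv list that runs `module` under debugpy, listening for a debugger."""
    command = [sys.executable, "-m", "debugpy", "--listen", f"{host}:{port}"]
    if wait_for_client:
        # Hold the service at start-up until a debugger has attached.
        command.append("--wait-for-client")
    command += ["-m", module]
    return command


if __name__ == "__main__":
    # `my_service` is a placeholder for your package's entry module.
    print(" ".join(debugpy_command("my_service")))
```

The printed command can be run in the attached VSCode terminal, after which a debugger can attach on the chosen port.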

Once you are happy with the changes, commit them and release a new version of your container; otherwise they will not persist across container restarts. Your git and ssh config will be mounted inside the devcontainer while connected, and for projects hosted on GitHub the remote `origin` will be configured to use SSH.
Binary file added docs/images/debugging-kubernetes.jpg
1 change: 1 addition & 0 deletions example-answers.yml
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ component_lifecycle: experimental
description: An expanded https://github.com/DiamondLightSource/python-copier-template to illustrate how it looks with all the options enabled.
distribution_name: dls-python-copier-template-example
docker: true
docker_debug: true
docs_type: sphinx
git_platform: github.com
github_org: DiamondLightSource
Expand Down
27 changes: 25 additions & 2 deletions template/Dockerfile.jinja
Original file line number Diff line number Diff line change
Expand Up @@ -14,10 +14,33 @@ ENV PATH=/venv/bin:$PATH{% if docker %}

# The build stage installs the context into the venv
FROM developer AS build
COPY . /context
WORKDIR /context
# Requires buildkit 0.17.0
COPY --chmod=o+wrX . /workspaces/{{ repo_name }}
WORKDIR /workspaces/{{ repo_name }}
RUN touch dev-requirements.txt && pip install -c dev-requirements.txt .

{% if docker_debug %}
FROM build AS debug

{% if git_platform=="github.com" %}
# Set origin to use ssh
RUN git remote set-url origin git@github.com:{{github_org}}/{{repo_name}}.git
{% endif %}

# For this pod to understand finding user information from LDAP
RUN apt update
RUN DEBIAN_FRONTEND=noninteractive apt install libnss-ldapd -y
RUN sed -i 's/files/ldap files/g' /etc/nsswitch.conf

# Make editable and debuggable
RUN pip install debugpy
RUN pip install -e .

# Alternate entrypoint to allow devcontainer to attach
ENTRYPOINT [ "/bin/bash", "-c", "--" ]
CMD [ "while true; do sleep 30; done;" ]

{% endif %}
# The runtime stage copies the built venv into a slim runtime container
FROM python:${PYTHON_VERSION}-slim AS runtime
# Add apt-get system dependencies for runtime here if needed
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,16 @@ jobs:
    permissions:
      contents: read
      packages: write
{% endraw %}{% endif %}{% if sphinx %}
{% endraw %}{% if docker_debug %}{% raw %}
  debug_container:
    needs: [container, test]
    uses: ./.github/workflows/_debug_container.yml
    with:
      publish: ${{ needs.test.result == 'success' }}
    permissions:
      contents: read
      packages: write
{% endraw %}{% endif %}{% endif %}{% if sphinx %}
  docs:
    uses: ./.github/workflows/_docs.yml

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
on:
  workflow_call:
    inputs:
      publish:
        type: boolean
        description: If true, pushes image to container registry

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          # Need this to get version number from last tag
          fetch-depth: 0

      - name: Set up Docker Buildx
        id: buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to GitHub Docker Registry
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Create tags for publishing debug image
        id: debug-meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/${{ github.repository }}
          tags: |
            type=ref,event=tag,suffix=-debug
            type=raw,value=latest-debug

      - name: Build and publish debug image to container registry
        if: github.ref_type == 'tag'
        uses: docker/build-push-action@v6
        env:
          DOCKER_BUILD_RECORD_UPLOAD: false
        with:
          context: .
          push: true
          target: debug
          tags: ${{ steps.debug-meta.outputs.tags }}