Kubernetes 1.26.1 - Linux Capabilities - starting container process caused: apply caps: operation not permitted #330

@MysticalMount

Describe the bug

I've deployed the workers to a privileged namespace:

Namespace: cc

apiVersion: v1
kind: Namespace
metadata:
  name: cc
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/enforce-version: v1.26
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/audit-version: v1.26
    pod-security.kubernetes.io/warn: privileged
    pod-security.kubernetes.io/warn-version: v1.26

This is on Kubernetes 1.26.1.

When I try to run a hello-world pipeline, Guardian inside the worker pod logs this error:

{"timestamp":"2023-04-09T16:51:45.884909106Z","level":"error","source":"guardian","message":"guardian.api.garden-server.create.failed","data":{"error":"runc run: exit status 1: container_linux.go:380: starting container process caused: apply caps: operation not permitted","request":{"Handle":"54e0c267-01e1-4e01-690f-df2cff3b5bf8","GraceTime":0,"RootFSPath":"raw:///concourse-work-dir/volumes/live/68c0ccae-e204-453a-6365-4e8b36d6e541/volume","BindMounts":[{"src_path":"/concourse-work-dir/volumes/live/63e92077-b1e9-428d-5172-fca9332f4ac1/volume","dst_path":"/scratch","mode":1}],"Network":"","Privileged":true,"Limits":{"bandwidth_limits":{},"cpu_limits":{},"disk_limits":{},"memory_limits":{},"pid_limits":{}}},"session":"3.1.4548"}
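To make a line like the one above easier to read, here is a minimal Python sketch that pulls out the three fields that matter for this report: the message, the error string, and whether the request was privileged. It is not part of Concourse or Guardian; the function name `summarize_guardian_line` is made up for illustration, and it assumes the log line is a single JSON object shaped like the one above.

```python
import json


def summarize_guardian_line(line):
    """Return (message, error, privileged) from one Guardian JSON log line.

    Hypothetical helper for illustration only. Fields that are absent
    come back as None instead of raising.
    """
    entry = json.loads(line)
    data = entry.get("data", {})
    request = data.get("request", {})
    return entry.get("message"), data.get("error"), request.get("Privileged")
```

Applied to the line above, this shows that the container create request itself was made with Privileged set to true, and that the failure came from runc while applying capabilities.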

I'm fairly new to Concourse, so if I'm missing something, sorry!

I can see that securityContext: privileged: true is set on the workers' StatefulSet in the source YAML, and it also appears to be set in the resulting StatefulSet:

        securityContext:
          capabilities:
            add:
            - all
          privileged: true

(I've been adding the capabilities to try to resolve the issue.)

As far as I can tell the container is privileged. I am also using TalosCtl, but so far I can't find anything to suggest it is Talos-related.

Any steps, help, or advice on where to go next, or on what I've missed, would be welcome.
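One way to check the "container is privileged" assumption from inside the worker container is to read the capability masks the kernel reports in /proc/self/status. The following is only a sketch: the helper names are hypothetical, the capability table is a small subset of the bit numbers in linux/capability.h, and it assumes Python is available wherever it runs (the worker image may not ship it, in which case the status text can be copied out and passed to `parse_cap_sets`). It does not identify the cause of the error; it only shows which capabilities the process actually holds.

```python
import os
import re

# Bit numbers from linux/capability.h (subset chosen for illustration).
CAP_BITS = {
    "CAP_CHOWN": 0,
    "CAP_SETPCAP": 8,
    "CAP_NET_ADMIN": 12,
    "CAP_SYS_ADMIN": 21,
    "CAP_MKNOD": 27,
}


def parse_cap_sets(status_text):
    """Map CapInh/CapPrm/CapEff/CapBnd/CapAmb to integer bit masks."""
    pattern = r"^(Cap\w+):\s*([0-9a-fA-F]+)$"
    return {
        name: int(hex_mask, 16)
        for name, hex_mask in re.findall(pattern, status_text, re.MULTILINE)
    }


def missing_caps(mask, wanted=CAP_BITS):
    """Return the names in `wanted` whose bit is not set in `mask`."""
    return sorted(name for name, bit in wanted.items() if not mask & (1 << bit))


if __name__ == "__main__" and os.path.exists("/proc/self/status"):
    with open("/proc/self/status") as f:
        sets = parse_cap_sets(f.read())
    # Effective set: what the process can use now.
    # Bounding set: the upper limit for anything it starts.
    for name in ("CapEff", "CapBnd"):
        mask = sets.get(name, 0)
        print(f"{name}: {mask:#018x} missing={missing_caps(mask)}")
```

If either set reports missing capabilities, the worker is running with fewer capabilities than a fully privileged container would have, which would be worth including in this report.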

Reproduction steps

  1. Deploy Kubernetes v1.26.1
  2. Deploy the Helm chart with mostly default settings, with web and worker
  3. Connect to web and deploy the example pipeline using fly
    ...

Expected behavior

I expected the container image to pull and the container to start successfully.

Additional context

In my setup I'm using custom registries, so I expect some extra setup there, but I suspect this error occurs before registry configuration comes into play.

Metadata

Labels: bug (Something isn't working)
