Description
Issue Summary
The render-config init containers in ESS (Element Server
Suite) experience a permission denied error when trying
to write configuration files to an emptyDir volume,
causing pods to enter a crash loop.
Environment Details
- Kubernetes Distribution: MicroK8s v1.33.5
- OS: Ubuntu 24.04.3 LTS
- Kernel: 6.8.0-87-generic
- Container Runtime: containerd://1.7.27
- Architecture: amd64
- Helm Chart: matrix-stack-25.12.0
- Matrix Tools Image:
ghcr.io/element-hq/ess-helm/matrix-tools:0.5.6
Affected Components
- Synapse: ess-synapse-main-0 StatefulSet
- Matrix Authentication Service:
ess-matrix-authentication-service-6644fd87cd-xxxx
Deployment
Error Details
Error writing to file: open /conf/homeserver.yaml: permission denied
Root Cause Analysis
- Security Context Configuration
Both containers have a restrictive security context:
securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
  readOnlyRootFilesystem: true
- Volume Configuration
The affected volume is an emptyDir with medium: Memory:
volumes:
  - name: rendered-config
    emptyDir:
      medium: Memory
- Volume Mounts
- render-config init container: Mounts /conf (writable)
- synapse container: Mounts /conf/homeserver.yaml
(readOnly, subPath)
- Issue Pattern
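The permission issue appears to be related to one or more of the following:
- The combination of readOnlyRootFilesystem: true with an in-memory emptyDir
- A potential timing issue where the volume is not properly initialized with the correct permissions
- The containers possibly running as a non-root user. No runAsUser is set in the securityContext shown above, so the UID comes from the image. Dropping ALL capabilities does not itself change the user, but it does remove CAP_DAC_OVERRIDE, so even a root process would be subject to normal file permission checks.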
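To narrow down which of these factors is responsible, the same securityContext, volume, and mount layout can be reproduced in a throwaway pod. The following is a minimal sketch, not part of the ESS chart: the pod name, the use of busybox for both containers, and the shell commands are assumptions made for illustration. busybox runs as root unless runAsUser is set, so the result may differ from what the matrix-tools image sees.

```yaml
# Hypothetical debug pod; the name, images, and commands are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: conf-write-test        # hypothetical name
  namespace: ess
spec:
  restartPolicy: Never
  initContainers:
    - name: render-config
      image: busybox
      # Print the effective UID/GID and the mode of /conf, then attempt a write.
      command: ['sh', '-c', 'id; ls -ld /conf; touch /conf/homeserver.yaml && echo write-ok']
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
        readOnlyRootFilesystem: true
      volumeMounts:
        - name: rendered-config
          mountPath: /conf
  containers:
    - name: synapse
      image: busybox
      # Confirm the rendered file is visible through the read-only subPath mount.
      command: ['sh', '-c', 'ls -l /conf/homeserver.yaml']
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
        readOnlyRootFilesystem: true
      volumeMounts:
        - name: rendered-config
          mountPath: /conf/homeserver.yaml
          subPath: homeserver.yaml
          readOnly: true
  volumes:
    - name: rendered-config
      emptyDir:
        medium: Memory
```

Comparing the init container's log from this pod with one where medium: Memory is removed, or where runAsUser is set explicitly, would help show whether the volume medium or the effective user is the deciding factor.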
Timeline
- Initial deployment: Dec 6, 23:30:48 UTC (Revision 1)
- Helm upgrade: Dec 7, 00:39:10 UTC (Revision 2). This might have triggered the issue.
- Issue duration: Approximately 15 hours (until manual intervention)
Workaround Applied
Force-deleting and recreating the affected pods resolved
the issue:
kubectl delete pod -n ess ess-synapse-main-0 --force --grace-period=0
kubectl delete pod -n ess ess-matrix-authentication-service-xxx --force --grace-period=0
Potential Solutions to Consider
- Add explicit user/group to securityContext:
  securityContext:   # pod level; fsGroup is not a container-level field
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
- Add volume permission initialization:
  securityContext:   # pod level
    fsGroup: 1000
- Consider using a regular emptyDir instead of memory medium for debugging
- Add init container to set permissions:
  - name: set-permissions
    image: busybox
    command: ['sh', '-c', 'chmod 755 /conf']
    volumeMounts:
      - name: rendered-config
        mountPath: /conf
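Because fsGroup is only accepted in the pod-level securityContext while the capability and filesystem settings are container-level, it may help to see where each would sit. The fragment below is a sketch under assumptions: UID/GID 1000 comes from the suggestion above and is not confirmed to match the user the matrix-tools image runs as, and the image and command fields are omitted because this is not a complete manifest.

```yaml
# Pod spec fragment for illustration; not taken from the ESS chart.
spec:
  securityContext:             # pod level: fsGroup is only accepted here
    runAsUser: 1000            # assumed UID, unverified for this image
    runAsGroup: 1000
    fsGroup: 1000              # asks the kubelet to make the volume group-owned by GID 1000
  initContainers:
    - name: render-config
      securityContext:         # container level: no fsGroup field exists here
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
        readOnlyRootFilesystem: true
      volumeMounts:
        - name: rendered-config
          mountPath: /conf
  volumes:
    - name: rendered-config
      emptyDir:
        medium: Memory
```

Whether the chart exposes values for these pod-level fields is not established here and would need to be checked against the chart's documentation.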