Commit 26795c3

Add KEP-5502 for EmptyDir volume sticky bit support
This KEP proposes adding an optional `stickyBit` field to EmptyDirVolumeSource that sets directory permissions to 01777 instead of 0777, preventing users from deleting files they don't own. References: Enhancement issue: #5502 Implementation PR: kubernetes/kubernetes#130277
1 parent 99b9280 commit 26795c3

3 files changed

+427
-0
lines changed

Lines changed: 397 additions & 0 deletions
@@ -0,0 +1,397 @@
1+
# KEP-5502: EmptyDir Volume Sticky Bit Support
2+
3+
<!-- toc -->
- [Release Signoff Checklist](#release-signoff-checklist)
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Proposal](#proposal)
  - [User Stories](#user-stories)
    - [Story 1: Shared Temporary Storage for Multi-User Workloads](#story-1-shared-temporary-storage-for-multi-user-workloads)
  - [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
  - [API Changes](#api-changes)
  - [Implementation](#implementation)
  - [Test Plan](#test-plan)
    - [Prerequisite testing updates](#prerequisite-testing-updates)
    - [Unit tests](#unit-tests)
    - [Integration tests](#integration-tests)
    - [e2e tests](#e2e-tests)
  - [Graduation Criteria](#graduation-criteria)
    - [Alpha](#alpha)
    - [Beta](#beta)
    - [GA](#ga)
  - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
  - [Version Skew Strategy](#version-skew-strategy)
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
  - [Feature Enablement and Rollback](#feature-enablement-and-rollback)
  - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
  - [Monitoring Requirements](#monitoring-requirements)
  - [Dependencies](#dependencies)
  - [Scalability](#scalability)
  - [Troubleshooting](#troubleshooting)
- [Implementation History](#implementation-history)
- [Drawbacks](#drawbacks)
- [Alternatives](#alternatives)
  - [Alternative 1: Provide more flexible mount options on emptyDir](#alternative-1-provide-more-flexible-mount-options-on-emptydir)
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional)
<!-- /toc -->
40+
41+
## Release Signoff Checklist
42+
43+
Items marked with (R) are required *prior to targeting to a milestone / release*.
44+
45+
- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
46+
- [ ] (R) KEP approvers have approved the KEP status as `implementable`
47+
- [ ] (R) Design details are appropriately documented
48+
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
49+
- [ ] e2e Tests for all Beta API Operations (endpoints)
50+
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
51+
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
52+
- [ ] (R) Graduation criteria is in place
53+
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) within one minor version of promotion to GA
54+
- [ ] (R) Production readiness review completed
55+
- [ ] (R) Production readiness review approved
56+
- [ ] "Implementation History" section is up-to-date for milestone
57+
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
58+
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
59+
60+
[kubernetes.io]: https://kubernetes.io/
61+
[kubernetes/enhancements]: https://git.k8s.io/enhancements
62+
[kubernetes/kubernetes]: https://git.k8s.io/kubernetes
63+
[kubernetes/website]: https://git.k8s.io/website
64+
65+
## Summary
66+
67+
This KEP proposes adding support for the sticky bit permission (mode 01777) to emptyDir volumes in Kubernetes. The sticky bit is a Unix file permission that restricts file deletion within a directory: only the file owner, the directory owner, or root can delete files, even if all users have write permission. For security reasons, a world-writable directory without the sticky bit may be rejected as a temporary directory. In that case emptyDir cannot be used for temporary storage, and users have to resort to ephemeral volumes.
68+
69+
## Motivation
70+
71+
The emptyDir volume currently creates directories with mode 0777, allowing any process with write access to delete or rename any file in the volume, regardless of who created it. This behavior can cause problems in multi-user or multi-process workloads where:
72+
73+
1. Multiple containers or processes running as different users share the same emptyDir volume
74+
2. One process accidentally or maliciously deletes files created by another process
75+
3. Init containers and main containers need to share files, but the main container should not be able to delete the init container's files
76+
77+
The sticky bit (mode 01777) is a standard Unix permission that solves this problem by ensuring that only the owner of a file (or the directory owner, or root) can delete or rename it, even when the directory is world-writable.
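
The deletion rule described above can be sketched as a small predicate. This is a simplified model, not kernel or kubelet code: the function name `canDelete` and its parameters are hypothetical, the caller is assumed to already have write and search permission on the directory, and uid 0 stands in for any privileged process.

```go
package main

import "fmt"

// canDelete models who may remove or rename an entry in a directory the
// caller can already write to. All names here are illustrative only.
func canDelete(callerUID, fileOwnerUID, dirOwnerUID int, sticky bool) bool {
	if !sticky {
		// Without the sticky bit, write access to the directory is enough.
		return true
	}
	// With the sticky bit, only the file owner, the directory owner,
	// or root may remove or rename the entry.
	return callerUID == 0 || callerUID == fileOwnerUID || callerUID == dirOwnerUID
}

func main() {
	// File owned by uid 1000 in a directory owned by uid 0; caller is uid 2000.
	fmt.Println(canDelete(2000, 1000, 0, false)) // mode 0777: true
	fmt.Println(canDelete(2000, 1000, 0, true))  // mode 01777: false
	fmt.Println(canDelete(1000, 1000, 0, true))  // owner removing its own file: true
}
```

With mode 0777 the unrelated user can delete the file; with mode 01777 only the file's owner (or the directory owner, or root) can.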
78+
79+
### Goals
80+
81+
- Add an optional `stickyBit` field to the emptyDir volume specification
82+
- When enabled, create emptyDir volumes with mode 01777 instead of 0777
83+
- Maintain backward compatibility by keeping the default behavior (mode 0777) unchanged
84+
- Support the feature on all platforms that support Unix file permissions
85+
86+
### Non-Goals
87+
88+
- Changing the default behavior of existing emptyDir volumes (mode 0777 remains the default)
89+
- Adding support for other advanced file permission features
90+
- Implementing this feature for volume types other than emptyDir
91+
- Supporting this feature on platforms that don't support Unix-style file permissions (e.g., Windows)
92+
93+
## Proposal
94+
95+
Add a new optional boolean field `stickyBit` to the `EmptyDirVolumeSource` API type. When set to `true`, the kubelet will create the emptyDir volume with mode 01777 (0777 | sticky bit) instead of the default 0777.
96+
97+
### User Stories
98+
99+
#### Story 1: Shared Temporary Storage for Multi-User Workloads
100+
101+
For containerized Ruby apps, `/tmp` directories are rejected if they do not have the sticky bit set. This means `emptyDir` cannot be reliably used for temporary directories, so ephemeral volumes (which are more complex to manage) or RWX volumes (which are not well supported by many providers) have to be used instead.
102+
103+
Allowing emptyDir to be mounted with the sticky bit set would greatly reduce complexity for these applications.
104+
105+
### Risks and Mitigations
106+
107+
**Risk**: Users might not understand the sticky bit behavior and be confused when they cannot delete files created by other users.
108+
109+
**Mitigation**: Document the feature clearly with examples. The feature is opt-in, so users must explicitly enable it.
110+
111+
**Risk**: The feature might not work correctly on all container runtimes or storage backends.
112+
113+
**Mitigation**: The sticky bit is a standard Unix permission supported by all major filesystems. The feature is opt-in (users must explicitly set `stickyBit: true`), allowing for gradual adoption and testing.
114+
115+
**Risk**: Existing workloads might be affected if the default changes.
116+
117+
**Mitigation**: The feature is opt-in via a new API field. Existing workloads will continue to use mode 0777 unless explicitly configured otherwise.
118+
119+
## Design Details
120+
121+
### API Changes
122+
123+
Add a new optional field to the `EmptyDirVolumeSource` struct:
124+
125+
```go
type EmptyDirVolumeSource struct {
	// ... existing fields ...

	// StickyBit sets the emptyDir permission to 01777 instead of 0777.
	// When enabled, only the owner of a file (or the directory owner, or
	// root) can delete or rename it, even if the directory is world-writable.
	// This is similar to the /tmp directory behavior on Unix systems.
	// +optional
	StickyBit *bool `json:"stickyBit,omitempty" protobuf:"varint,3,opt,name=stickyBit"`
}
```
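
To illustrate why the field is a `*bool` with `omitempty`, the following sketch serializes a stand-in struct in its three states. The type `emptyDirSketch` and the helper `encode` are hypothetical and carry only the proposed field; this is not the real `EmptyDirVolumeSource`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// emptyDirSketch is a hypothetical stand-in for EmptyDirVolumeSource that
// carries only the proposed field.
type emptyDirSketch struct {
	StickyBit *bool `json:"stickyBit,omitempty"`
}

// encode returns the JSON form of the sketch struct.
func encode(v emptyDirSketch) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	on, off := true, false
	fmt.Println(encode(emptyDirSketch{}))                // {}
	fmt.Println(encode(emptyDirSketch{StickyBit: &on}))  // {"stickyBit":true}
	fmt.Println(encode(emptyDirSketch{StickyBit: &off})) // {"stickyBit":false}
}
```

An unset pointer is omitted from the serialized object entirely, so existing pod specs stay byte-for-byte unchanged, while an explicit `false` remains distinguishable from "not specified".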
137+
138+
### Implementation
139+
140+
The implementation is in the emptyDir volume plugin in `pkg/volume/emptydir/empty_dir.go`:
141+
142+
1. Define constants for the sticky bit mode:
143+
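```go
const (
	// os.FileMode encodes the sticky bit as os.ModeSticky, not as the raw
	// octal 01000 used by chmod(2); os.MkdirAll and os.Chmod translate it.
	stickyBitMode os.FileMode = os.ModeSticky
	defaultPerm   os.FileMode = 0777
)
```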
149+
150+
2. When creating the emptyDir directory, check if the `StickyBit` field is set:
151+
```go
perm := defaultPerm
if ed.stickyBit != nil && *ed.stickyBit {
	perm = defaultPerm | stickyBitMode
}
```
157+
158+
3. Apply the appropriate permissions when creating the directory
159+
160+
### Test Plan
161+
162+
[x] I/we understand the owners of the involved components may require updates to
163+
existing tests to make this code solid enough prior to committing the changes necessary
164+
to implement this enhancement.
165+
166+
#### Prerequisite testing updates
167+
168+
No prerequisite testing updates are required. The emptyDir volume plugin already has good test coverage.
169+
170+
#### Unit tests
171+
172+
Unit tests have been added to verify:
173+
- Directory creation with sticky bit enabled results in mode 01777
174+
- Directory creation with sticky bit disabled or unset results in mode 0777
175+
176+
Coverage:
177+
- `pkg/volume/emptydir`: Unit tests cover the sticky bit implementation and default behavior
178+
179+
#### Integration tests
180+
181+
If needed, integration tests could additionally verify:
182+
- A pod with emptyDir volume and stickyBit enabled mounts correctly
183+
- Older kubelets ignore the field gracefully
184+
185+
#### e2e tests
186+
187+
TBD - e2e tests will be added as part of the implementation.
188+
189+
### Graduation Criteria
190+
191+
#### Alpha
192+
193+
- API field implemented and functional
194+
- Unit tests passing
195+
- Documentation available
196+
197+
#### Beta
198+
199+
- No major bugs reported during alpha
200+
- Gather feedback from users
201+
202+
#### GA
203+
204+
- Stable for at least two releases
205+
- No major issues reported
206+
207+
### Upgrade / Downgrade Strategy
208+
209+
No special upgrade/downgrade handling is needed. The `stickyBit` field is optional and ignored by older kubelets that don't recognize it.
210+
211+
### Version Skew Strategy
212+
213+
The feature is kubelet-only. Older kubelets will ignore the `stickyBit` field and create emptyDir volumes with the default mode 0777. This is safe as it matches the previous behavior.
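
As an analogy for the skew behavior described above, the sketch below decodes a spec containing `stickyBit` into a struct that predates the field. It uses only `encoding/json` and a hypothetical `oldEmptyDir` type; it is an assumption that this mirrors how an older kubelet handles the field, and it does not reproduce the kubelet's actual decoding path.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// oldEmptyDir is a hypothetical view of the volume source as a component
// built before the stickyBit field existed would see it.
type oldEmptyDir struct {
	Medium string `json:"medium,omitempty"`
}

// decodeOld parses a spec with the older struct. encoding/json drops
// unknown keys by default instead of returning an error.
func decodeOld(spec string) (oldEmptyDir, error) {
	var v oldEmptyDir
	err := json.Unmarshal([]byte(spec), &v)
	return v, err
}

func main() {
	v, err := decodeOld(`{"medium":"Memory","stickyBit":true}`)
	// The known field is kept; the unknown field is silently ignored.
	fmt.Println(v.Medium, err)
}
```

Because the unknown field is dropped rather than rejected, the older component proceeds with its existing default, which corresponds to mode 0777 here.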
214+
215+
## Production Readiness Review Questionnaire
216+
217+
### Feature Enablement and Rollback
218+
219+
###### How can this feature be enabled / disabled in a live cluster?
220+
221+
- [ ] Feature gate (also fill in values in `kep.yaml`)
222+
- Feature gate name:
223+
- Components depending on the feature gate:
224+
- [x] Other
225+
- Describe the mechanism: The feature is enabled per-volume by setting `stickyBit: true` on an emptyDir volume in the pod spec. No feature gate is required as this is a simple opt-in API field.
226+
- Will enabling / disabling the feature require downtime of the control plane? No
227+
- Will enabling / disabling the feature require downtime or reprovisioning of a node? No
228+
229+
###### Does enabling the feature change any default behavior?
230+
231+
No. The feature only takes effect when users explicitly set `stickyBit: true` on an emptyDir volume. Existing emptyDir volumes and new emptyDir volumes without the field continue to use mode 0777.
232+
233+
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
234+
235+
Yes. Since there is no feature gate, the feature is controlled per-pod by setting or omitting the `stickyBit` field. To "disable" the feature, simply remove `stickyBit: true` from pod specs.
236+
237+
If rolling back to an older kubelet version that doesn't support the field, the field will be ignored and emptyDir volumes will be created with mode 0777.
238+
239+
**Impact on existing workloads**: Pods that were running with sticky bit enabled will continue to run unchanged (the directory permissions don't change after creation). However, new pods or pods that are rescheduled will have emptyDir volumes created with mode 0777 instead of 01777, which could affect application behavior if the application relies on the sticky bit behavior.
240+
241+
###### What happens if we reenable the feature if it was previously rolled back?
242+
243+
The feature will work as expected for new pods. Existing pods that were created while the feature was disabled will continue to use mode 0777 until they are deleted and recreated.
244+
245+
###### Are there any tests for feature enablement/disablement?
246+
247+
Yes, unit tests verify that:
248+
- When `stickyBit: true`, the directory is created with mode 01777
249+
- When `stickyBit` is false or unset, the directory is created with mode 0777
250+
- The default behavior (mode 0777) is preserved when the field is not specified
251+
252+
### Rollout, Upgrade and Rollback Planning
253+
254+
###### How can a rollout or rollback fail? Can it impact already running workloads?
255+
256+
**Rollout failure scenarios**:
257+
- If the feature has bugs that cause emptyDir volume creation to fail, pods using `stickyBit: true` will fail to start
258+
- If the host OS or filesystem doesn't support sticky bit (unlikely on standard Linux), volume creation could fail
259+
260+
**Impact on running workloads**:
261+
- Already running workloads are not affected by enabling or disabling the feature
262+
- Only new pods or rescheduled pods are affected
263+
- The feature is opt-in, so workloads that don't use it are unaffected
264+
265+
**Rollback scenarios**:
266+
- Rolling back to an older kubelet is safe and will not affect running pods
267+
- On older kubelets, new pods with `stickyBit: true` will get mode 0777 instead of 01777 (the field is ignored), which is a functional change but not a failure
268+
269+
###### What specific metrics should inform a rollback?
270+
271+
Increased pod startup failures or volume mount errors correlated with pods using `stickyBit: true`.
272+
273+
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
274+
275+
Not yet. Will be tested manually before release.
276+
277+
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
278+
279+
No.
280+
281+
### Monitoring Requirements
282+
283+
###### How can an operator determine if the feature is in use by workloads?
284+
285+
Operators can:
286+
1. Query the API server for pods with emptyDir volumes that have `stickyBit: true`:
287+
```bash
kubectl get pods -A -o json | jq '.items[] | select(.spec.volumes[]?.emptyDir?.stickyBit == true)'
```
290+
2. Check kubelet logs for messages related to sticky bit creation
291+
3. Inspect pod specifications directly
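
The same check as the `jq` query can be sketched in Go against the JSON produced by `kubectl get pods -A -o json`. The type `podList` and the function `podsWithStickyBit` are hypothetical and model only the fields needed here; the sample input in `main` is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// podList models only the parts of a pod list that this check needs.
type podList struct {
	Items []struct {
		Metadata struct {
			Namespace string `json:"namespace"`
			Name      string `json:"name"`
		} `json:"metadata"`
		Spec struct {
			Volumes []struct {
				EmptyDir *struct {
					StickyBit *bool `json:"stickyBit"`
				} `json:"emptyDir"`
			} `json:"volumes"`
		} `json:"spec"`
	} `json:"items"`
}

// podsWithStickyBit returns "namespace/name" for every pod that has at
// least one emptyDir volume with stickyBit set to true.
func podsWithStickyBit(data []byte) ([]string, error) {
	var list podList
	if err := json.Unmarshal(data, &list); err != nil {
		return nil, err
	}
	var out []string
	for _, p := range list.Items {
		for _, v := range p.Spec.Volumes {
			if v.EmptyDir != nil && v.EmptyDir.StickyBit != nil && *v.EmptyDir.StickyBit {
				out = append(out, p.Metadata.Namespace+"/"+p.Metadata.Name)
				break // report each pod once
			}
		}
	}
	return out, nil
}

func main() {
	// Made-up sample; in practice the bytes would come from kubectl output.
	sample := []byte(`{"items":[{"metadata":{"namespace":"default","name":"web"},
		"spec":{"volumes":[{"emptyDir":{"stickyBit":true}}]}}]}`)
	pods, err := podsWithStickyBit(sample)
	fmt.Println(pods, err)
}
```

Unlike the `jq` filter, this sketch lists each matching pod once even when several of its volumes set the field.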
292+
293+
###### How can someone using this feature know that it is working for their instance?
294+
295+
- [x] Other (treat as last resort)
296+
- Details: Users can verify the feature is working by:
297+
1. Creating a pod with an emptyDir volume with `stickyBit: true`
298+
2. Exec into the pod and check the directory permissions: `ls -ld /path/to/emptydir`
299+
3. Verify the permissions show `drwxrwxrwt` (mode 01777, the 't' at the end indicates sticky bit)
300+
4. Test the behavior by creating a file as one user and attempting to delete it as another user
301+
302+
###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
303+
304+
This feature should not affect existing SLOs. The performance impact should be negligible:
305+
306+
- emptyDir volume creation time should not be measurably affected
307+
- Pod startup time should not be measurably affected
308+
309+
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
310+
311+
- [ ] Metrics
312+
- Metric name: storage_operation_duration_seconds (existing metric)
313+
- Components exposing the metric: kubelet
314+
- This metric can be filtered by operation_name="setup" to track emptyDir volume creation time
315+
316+
Operators should monitor:
317+
- Pod startup failures
318+
- Volume mount failures
319+
- kubelet errors
320+
321+
###### Are there any missing metrics that would be useful to have to improve observability of this feature?
322+
323+
No additional metrics are needed. The feature is a simple file permission change and can be observed using existing pod and volume metrics.
324+
325+
### Dependencies
326+
327+
###### Does this feature depend on any specific services running in the cluster?
328+
329+
No. The feature only depends on:
330+
- The host OS supporting the sticky bit permission (standard on all Linux systems)
331+
- The filesystem supporting sticky bit (standard on all major filesystems)
332+
333+
### Scalability
334+
335+
###### Will enabling / using this feature result in any new API calls?
336+
337+
No.
338+
339+
###### Will enabling / using this feature result in introducing new API types?
340+
341+
No. It adds a new field to an existing API type (EmptyDirVolumeSource).
342+
343+
###### Will enabling / using this feature result in any new calls to the cloud provider?
344+
345+
No.
346+
347+
###### Will enabling / using this feature result in increasing size or count of the existing API objects?
348+
349+
- API type(s): Pod (EmptyDirVolumeSource)
350+
- Estimated increase in size: One additional boolean field per emptyDir volume that uses the feature, when set
351+
352+
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
353+
354+
No. The performance impact should be negligible.
355+
356+
###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
357+
358+
No. The feature only changes one argument to a mkdir system call.
359+
360+
###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
361+
362+
No.
363+
364+
### Troubleshooting
365+
366+
###### How does this feature react if the API server and/or etcd is unavailable?
367+
368+
The feature is implemented in the kubelet and does not depend on the API server or etcd after the pod spec has been retrieved.
369+
###### What are other known failure modes?
370+
371+
None beyond the standard emptyDir failure modes.
372+
373+
###### What steps should be taken if SLOs are not being met to determine the problem?
374+
375+
This feature should not affect SLOs. If pod startup or volume mounting SLOs are not being met, check if the affected pods are using `stickyBit: true` and verify kubelet logs for errors.
376+
377+
## Implementation History
378+
379+
- 2025-02-19 Initial implementation started (kubernetes/kubernetes#130277)
380+
- 2025-08-25 KEP issue created (kubernetes/enhancements#5502)
381+
- 2026-01-30 KEP created for alpha in v1.36
382+
383+
## Drawbacks
384+
385+
- Adds a new API field, slightly increasing API surface
386+
- Users unfamiliar with Unix permissions may be confused by sticky bit behavior
387+
- Not supported on Windows (but emptyDir permissions work differently there anyway)
388+
389+
## Alternatives
390+
391+
### Alternative 1: Provide more flexible mount options on emptyDir
392+
393+
There appears to be interest in providing more configuration options for mounting, which could include setting permissions.
394+
395+
References: https://github.com/kubernetes/enhancements/pull/5856
396+
397+
## Infrastructure Needed (Optional)
