Commit 608cb7b

Mrunal Patel committed
Merge pull request #298 from wking/separate-device-cgroups-from-mknod
runtime-config-linux: Separate mknod from cgroups
2 parents 9017a6c + 7d5b027 commit 608cb7b

2 files changed: +115 -89 lines changed
config-linux.md

Lines changed: 94 additions & 82 deletions
@@ -16,27 +16,19 @@ Valid values are the strings for capabilities defined in [the man page](http://m
 ]
 ```
 
-## Default Devices and File Systems
+## Default File Systems
 
 The Linux ABI includes both syscalls and several special file paths.
 Applications expecting a Linux environment will very likely expect these files paths to be setup correctly.
 
-The following devices and filesystems MUST be made available in each application's filesystem
-
-| Path | Type | Notes |
-| ------------ | ------ | ------- |
-| /proc | [procfs](https://www.kernel.org/doc/Documentation/filesystems/proc.txt) | |
-| /sys | [sysfs](https://www.kernel.org/doc/Documentation/filesystems/sysfs.txt) | |
-| /dev/null | [device](http://man7.org/linux/man-pages/man4/null.4.html) | |
-| /dev/zero | [device](http://man7.org/linux/man-pages/man4/zero.4.html) | |
-| /dev/full | [device](http://man7.org/linux/man-pages/man4/full.4.html) | |
-| /dev/random | [device](http://man7.org/linux/man-pages/man4/random.4.html) | |
-| /dev/urandom | [device](http://man7.org/linux/man-pages/man4/random.4.html) | |
-| /dev/tty | [device](http://man7.org/linux/man-pages/man4/tty.4.html) | |
-| /dev/console | [device](http://man7.org/linux/man-pages/man4/console.4.html) | |
-| /dev/pts | [devpts](https://www.kernel.org/doc/Documentation/filesystems/devpts.txt) | |
-| /dev/ptmx | [device](https://www.kernel.org/doc/Documentation/filesystems/devpts.txt) | Bind-mount or symlink of /dev/pts/ptmx |
-| /dev/shm | [tmpfs](https://www.kernel.org/doc/Documentation/filesystems/tmpfs.txt) | |
+The following filesystems MUST be made available in each application's filesystem
+
+| Path | Type |
+| -------- | ------ |
+| /proc | [procfs](https://www.kernel.org/doc/Documentation/filesystems/proc.txt) |
+| /sys | [sysfs](https://www.kernel.org/doc/Documentation/filesystems/sysfs.txt) |
+| /dev/pts | [devpts](https://www.kernel.org/doc/Documentation/filesystems/devpts.txt) |
+| /dev/shm | [tmpfs](https://www.kernel.org/doc/Documentation/filesystems/tmpfs.txt) |
 
 ## Namespaces
 
@@ -115,93 +107,59 @@ There is a limit of 5 mappings which is the Linux kernel hard limit.
 
 ## Devices
 
-`devices` is an array specifying the list of devices to be created in the container.
+`devices` is an array specifying the list of devices that MUST be available in the container.
+The runtime may supply them however it likes (with [mknod][mknod.2], by bind mounting from the runtime mount namespace, etc.).
 
 The following parameters can be specified:
 
-* **`type`** *(char, required)* - type of device: `c`, `b`, `u` or `p`. More info in `man mknod`.
-
-* **`path`** *(string, optional)* - full path to device inside container
-
-* **`major, minor`** *(int64, required)* - major, minor numbers for device. More info in `man mknod`. There is a special value: `-1`, which means `*` for `device` cgroup setup.
-
-* **`permissions`** *(string, optional)* - cgroup permissions for device. A composition of `r` (*read*), `w` (*write*), and `m` (*mknod*).
-
-* **`fileMode`** *(uint32, optional)* - file mode for device file
-
-* **`uid`** *(uint32, optional)* - uid of device owner
-
-* **`gid`** *(uint32, optional)* - gid of device owner
-
-**`fileMode`**, **`uid`** and **`gid`** are required if **`path`** is given and are otherwise not allowed.
+* **`type`** *(char, required)* - type of device: `c`, `b`, `u` or `p`.
+  More info in [mknod(1)][mknod.1].
+* **`path`** *(string, required)* - full path to device inside container.
+* **`major, minor`** *(int64, required unless **`type`** is `p`)* - [major, minor numbers][devices] for the device.
+* **`fileMode`** *(uint32, optional)* - file mode for the device.
+  You can also control access to devices [with cgroups](#device-whitelist).
+* **`uid`** *(uint32, optional)* - id of device owner.
+* **`gid`** *(uint32, optional)* - id of device group.
 
 ###### Example
 
 ```json
     "devices": [
         {
-            "path": "/dev/random",
+            "path": "/dev/fuse",
             "type": "c",
-            "major": 1,
-            "minor": 8,
-            "permissions": "rwm",
+            "major": 10,
+            "minor": 229,
             "fileMode": 0666,
             "uid": 0,
             "gid": 0
         },
         {
-            "path": "/dev/urandom",
-            "type": "c",
-            "major": 1,
-            "minor": 9,
-            "permissions": "rwm",
-            "fileMode": 0666,
-            "uid": 0,
-            "gid": 0
-        },
-        {
-            "path": "/dev/null",
-            "type": "c",
-            "major": 1,
-            "minor": 3,
-            "permissions": "rwm",
-            "fileMode": 0666,
-            "uid": 0,
-            "gid": 0
-        },
-        {
-            "path": "/dev/zero",
-            "type": "c",
-            "major": 1,
-            "minor": 5,
-            "permissions": "rwm",
-            "fileMode": 0666,
-            "uid": 0,
-            "gid": 0
-        },
-        {
-            "path": "/dev/tty",
-            "type": "c",
-            "major": 5,
+            "path": "/dev/sda",
+            "type": "b",
+            "major": 8,
             "minor": 0,
-            "permissions": "rwm",
-            "fileMode": 0666,
-            "uid": 0,
-            "gid": 0
-        },
-        {
-            "path": "/dev/full",
-            "type": "c",
-            "major": 1,
-            "minor": 7,
-            "permissions": "rwm",
-            "fileMode": 0666,
+            "fileMode": 0660,
             "uid": 0,
             "gid": 0
         }
     ]
 ```
 
+###### Default Devices
+
+In addition to any devices configured with this setting, the runtime MUST also supply:
+
+* [`/dev/null`][null.4]
+* [`/dev/zero`][zero.4]
+* [`/dev/full`][full.4]
+* [`/dev/random`][random.4]
+* [`/dev/urandom`][random.4]
+* [`/dev/tty`][tty.4]
+* [`/dev/console`][console.4]
+* [`/dev/ptmx`][pts.4].
+  A [bind-mount or symlink of the container's `/dev/pts/ptmx`][devpts].
+
 ## Control groups
 
 Also known as cgroups, they are used to restrict resource usage for a container and handle device access.
@@ -228,6 +186,46 @@ You can configure a container's cgroups via the `resources` field of the Linux c
 Do not specify `resources` unless limits have to be updated.
 For example, to run a new process in an existing container without updating limits, `resources` need not be specified.
 
+#### Device whitelist
+
+`devices` is an array of entries to control the [device whitelist][cgroups-devices].
+The runtime MUST apply entries in the listed order.
+
+The following parameters can be specified:
+
+* **`allow`** *(boolean, required)* - whether the entry is allowed or denied.
+* **`type`** *(char, optional)* - type of device: `a` (all), `c` (char), or `b` (block).
+  `null` or unset values mean "all", mapping to `a`.
+* **`major, minor`** *(int64, optional)* - [major, minor numbers][devices] for the device.
+  `null` or unset values mean "all", mapping to [`*` in the filesystem API][cgroups-devices].
+* **`access`** *(string, optional)* - cgroup permissions for device.
+  A composition of `r` (read), `w` (write), and `m` (mknod).
+
+###### Example
+
+```json
+    "devices": [
+        {
+            "allow": false,
+            "access": "rwm"
+        },
+        {
+            "allow": true,
+            "type": "c",
+            "major": 10,
+            "minor": 229,
+            "access": "rw"
+        },
+        {
+            "allow": true,
+            "type": "b",
+            "major": 8,
+            "minor": 0,
+            "access": "r"
+        }
+    ]
+```
+
 #### Disable out-of-memory killer
 
 `disableOOMKiller` contains a boolean (`true` or `false`) that enables or disables the Out of Memory killer for a cgroup.
@@ -587,3 +585,17 @@ Setting `noNewPrivileges` to true prevents the processes in the container from g
 ```json
     "noNewPrivileges": true,
 ```
+
+[cgroups-devices]: https://www.kernel.org/doc/Documentation/cgroup-v1/devices.txt
+[devices]: https://www.kernel.org/doc/Documentation/devices.txt
+[devpts]: https://www.kernel.org/doc/Documentation/filesystems/devpts.txt
+
+[mknod.1]: http://man7.org/linux/man-pages/man1/mknod.1.html
+[mknod.2]: http://man7.org/linux/man-pages/man2/mknod.2.html
+[console.4]: http://man7.org/linux/man-pages/man4/console.4.html
+[full.4]: http://man7.org/linux/man-pages/man4/full.4.html
+[null.4]: http://man7.org/linux/man-pages/man4/null.4.html
+[pts.4]: http://man7.org/linux/man-pages/man4/pts.4.html
+[random.4]: http://man7.org/linux/man-pages/man4/random.4.html
+[tty.4]: http://man7.org/linux/man-pages/man4/tty.4.html
+[zero.4]: http://man7.org/linux/man-pages/man4/zero.4.html
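
Reviewer note on the `devices` section in config-linux.md: the text says a runtime may supply a device with mknod(2), so it may help to see how one entry could translate into the `mode` and `dev` arguments of that call. The following is only a sketch under stated assumptions, not the reference implementation. The names `MknodArgs`, `makedev`, and `modePtr` are hypothetical, the Go type of the `Type` field is assumed because the diff does not show it, the fallback of `0666` for an unset `fileMode` is an assumption the spec does not state, and the device-number packing follows the glibc encoding used on Linux.

```go
package main

import (
	"fmt"
	"os"
)

// Linux file-type bits from <sys/stat.h>, spelled out so the sketch
// compiles on any platform.
const (
	sIFIFO = 0o010000 // FIFO (type "p")
	sIFCHR = 0o020000 // character device (types "c" and "u")
	sIFBLK = 0o060000 // block device (type "b")
)

// Device mirrors the Device struct after this commit. The Type field is
// not visible in the diff, so its Go type here is an assumption.
type Device struct {
	Path     string
	Type     rune
	Major    int64
	Minor    int64
	FileMode *os.FileMode
	UID      *uint32
	GID      *uint32
}

// modePtr is a hypothetical helper for filling the optional FileMode field.
func modePtr(m os.FileMode) *os.FileMode { return &m }

// makedev packs major and minor numbers using the glibc encoding on Linux.
func makedev(major, minor int64) uint64 {
	ma, mi := uint64(major), uint64(minor)
	return (ma&0xfff)<<8 | (ma&0xfffff000)<<32 | (mi & 0xff) | (mi&0xffffff00)<<12
}

// MknodArgs returns the mode and dev arguments that mknod(2) would need
// for one entry of the devices array.
func MknodArgs(d Device) (mode uint32, dev uint64, err error) {
	switch d.Type {
	case 'c', 'u':
		mode = sIFCHR
	case 'b':
		mode = sIFBLK
	case 'p':
		mode = sIFIFO
	default:
		return 0, 0, fmt.Errorf("unknown device type %q", d.Type)
	}
	perm := os.FileMode(0o666) // assumption: the spec names no default
	if d.FileMode != nil {
		perm = d.FileMode.Perm()
	}
	mode |= uint32(perm)
	if d.Type != 'p' {
		dev = makedev(d.Major, d.Minor) // FIFOs carry no device number
	}
	return mode, dev, nil
}

func main() {
	fuse := Device{Path: "/dev/fuse", Type: 'c', Major: 10, Minor: 229, FileMode: modePtr(0o666)}
	mode, dev, err := MknodArgs(fuse)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s: mode=%#o dev=%d\n", fuse.Path, mode, dev)
}
```

A runtime that bind mounts devices from its own mount namespace would not need this translation at all, which is part of why the commit stops describing `devices` purely in terms of creation.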

config_linux.go

Lines changed: 21 additions & 7 deletions
@@ -33,7 +33,7 @@ type Linux struct {
 	CgroupsPath *string `json:"cgroupsPath,omitempty"`
 	// Namespaces contains the namespaces that are created and/or joined by the container
 	Namespaces []Namespace `json:"namespaces"`
-	// Devices are a list of device nodes that are created and enabled for the container
+	// Devices are a list of device nodes that are created for the container
 	Devices []Device `json:"devices"`
 	// ApparmorProfile specified the apparmor profile for the container.
 	ApparmorProfile string `json:"apparmorProfile"`
@@ -213,6 +213,8 @@ type Network struct {
 
 // Resources has container runtime resource constraints
 type Resources struct {
+	// Devices are a list of device rules for the whitelist controller
+	Devices []DeviceCgroup `json:"devices"`
 	// DisableOOMKiller disables the OOM killer for out of memory conditions
 	DisableOOMKiller *bool `json:"disableOOMKiller,omitempty"`
 	// Specify an oom_score_adj for the container.
@@ -231,7 +233,7 @@ type Resources struct {
 	Network *Network `json:"network,omitempty"`
 }
 
-// Device represents the information on a Linux special device file
+// Device represents the mknod information for a Linux special device file
 type Device struct {
 	// Path to the device.
 	Path string `json:"path"`
@@ -241,14 +243,26 @@ type Device struct {
 	Major int64 `json:"major"`
 	// Minor is the device's minor number.
 	Minor int64 `json:"minor"`
-	// Cgroup permissions format, rwm.
-	Permissions string `json:"permissions"`
 	// FileMode permission bits for the device.
-	FileMode os.FileMode `json:"fileMode"`
+	FileMode *os.FileMode `json:"fileMode,omitempty"`
 	// UID of the device.
-	UID uint32 `json:"uid"`
+	UID *uint32 `json:"uid,omitempty"`
 	// Gid of the device.
-	GID uint32 `json:"gid"`
+	GID *uint32 `json:"gid,omitempty"`
+}
+
+// DeviceCgroup represents a device rule for the whitelist controller
+type DeviceCgroup struct {
+	// Allow or deny
+	Allow bool `json:"allow"`
+	// Device type, block, char, etc.
+	Type *rune `json:"type,omitempty"`
+	// Major is the device's major number.
+	Major *int64 `json:"major,omitempty"`
+	// Minor is the device's minor number.
+	Minor *int64 `json:"minor,omitempty"`
+	// Cgroup access permissions format, rwm.
+	Access *string `json:"access,omitempty"`
 }
 
 // Seccomp represents syscall restrictions
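
Reviewer note on `DeviceCgroup`: the spec text maps unset `type`, `major`, and `minor` to `a` and `*` in the cgroup v1 devices controller, whose control files take entries of the form `type major:minor access`. A rough sketch of that mapping follows. The function names `Rule` and `wildcard` are hypothetical, the struct is copied without its JSON tags, and treating an unset `access` as `rwm` is an assumption that the spec text in this commit does not state.

```go
package main

import (
	"fmt"
	"strconv"
)

// DeviceCgroup mirrors the struct added in this commit (JSON tags omitted).
type DeviceCgroup struct {
	Allow  bool
	Type   *rune
	Major  *int64
	Minor  *int64
	Access *string
}

// wildcard renders a device number, mapping an unset value to "*".
func wildcard(n *int64) string {
	if n == nil {
		return "*"
	}
	return strconv.FormatInt(*n, 10)
}

// Rule returns the cgroup v1 control file a rule targets and the entry
// line to write to it.
func Rule(d DeviceCgroup) (file string, entry string) {
	file = "devices.deny"
	if d.Allow {
		file = "devices.allow"
	}
	typ := "a" // unset type means "all"
	if d.Type != nil {
		typ = string(*d.Type)
	}
	access := "rwm" // assumption: unset access is treated as all permissions
	if d.Access != nil {
		access = *d.Access
	}
	return file, fmt.Sprintf("%s %s:%s %s", typ, wildcard(d.Major), wildcard(d.Minor), access)
}

func main() {
	c, b := 'c', 'b'
	fuseMajor, fuseMinor := int64(10), int64(229)
	sdaMajor, sdaMinor := int64(8), int64(0)
	rwm, rw, r := "rwm", "rw", "r"

	// The three rules from the spec example, applied in the listed order.
	rules := []DeviceCgroup{
		{Allow: false, Access: &rwm},
		{Allow: true, Type: &c, Major: &fuseMajor, Minor: &fuseMinor, Access: &rw},
		{Allow: true, Type: &b, Major: &sdaMajor, Minor: &sdaMinor, Access: &r},
	}
	for _, rule := range rules {
		file, entry := Rule(rule)
		fmt.Printf("%s: %s\n", file, entry)
	}
}
```

Because the runtime MUST apply entries in the listed order, the leading deny-all rule acts as the baseline and the later allow rules punch specific holes in it. One caveat when wiring this to JSON: `encoding/json` treats `rune` as a number, so a string value such as `"c"` would not decode into the `*rune` field as declared.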
